CHLAMYDIA PNEUMONIAE GENOME SEQUENCE



CROSS-REFERENCES TO RELATED APPLICATIONS 
The present application is related to 60/128,606, filed April 8, 1999 and 
60/108,279, filed November 12, 1998, which are incorporated herein by reference. 

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 

FIELD OF THE INVENTION 
This invention relates to nucleic acids and polypeptides from Chlamydia 
pneumoniae and to their use in the diagnosis, prevention and treatment of diseases 
associated with C. pneumoniae. 

BACKGROUND OF THE INVENTION 

Chlamydiaceae is a family of obligate intracellular parasites with a tropism
for epithelial cells lining the mucous membranes. The bacteria have two morphologically
distinct forms, "elementary body" and "reticulate body". The elementary body is the 
infectious form, and has a rigid cell wall, primarily of cross-linked outer membrane 
proteins. The reticulate body is the intracellular, metabolically active form. A unique 
developmental cycle between these two forms characterizes Chlamydia growth. 

C. pneumoniae is a human respiratory pathogen that causes acute
respiratory disease and approximately 10% of community-acquired pneumonia.
Antibody prevalence studies have shown that virtually everyone is infected with C.
pneumoniae at some time, and that reinfection is common. In addition to respiratory
disease, studies have shown an association of this organism with coronary artery disease.
It has been demonstrated in atherosclerotic lesions of the aorta and coronary arteries by
immunocytochemistry and by polymerase chain reaction (Kuo et al. (1993) J Infect Dis
167(4):841-849).

Recent reports have further demonstrated the presence of C. pneumoniae
in the walls of abdominal aortic aneurysms (Juvonen et al. (1997) J Vasc Surg
25(3):499-505). Abdominal aortic aneurysms are frequently associated with
atherosclerosis, and inflammation may be an important factor in aneurysmal dilatation.
C. pneumoniae may play a role in maintaining inflammation and triggering the
development of aortic aneurysms.

Muhlestein et al. (1996) JACC 27:1555-61 reported a differential
incidence of Chlamydia species within the coronary artery wall of patients with
atherosclerosis versus those with other forms of cardiovascular disease. The extremely
high rate of possible infection in patients with symptomatic atherosclerotic disease
compared to the very low rate in patients with normal coronary arteries or coronary artery
disease from chronic transplant rejection provides evidence for a direct link between the
atherosclerotic process and Chlamydia infection. Because a history of chlamydial
infection is so prevalent in the population, the issue of causality remains. On a
physiologic and pathologic level, abnormal interactions among endothelial cells, platelets,
macrophages and lymphocytes may lead to a cascade of events resulting in acute
endothelial damage, thrombosis and repair, chronically leading to the development of
atheroma in blood vessels.

C. pneumoniae is related to other Chlamydia species, but the level of
sequence similarity is relatively low. Very little is known about the biology of this
organism, although it appears to be an important human pathogen. Allelic diversity and
structural relationships between specific genes of Chlamydial species are described in
Kaltenboeck et al. (1993) J Bacteriol 175(2):487-502; Gaydos et al. (1992) Infect Immun
60(12):5319-5323; Everett et al. (1997) Int J Syst Bacteriol 47(2):461-473; and
Pudjiatmoko et al. (1997) Int J Syst Bacteriol 47(2):425-431.

A number of studies have been published describing methods for detection
of C. pneumoniae, and for distinguishing between Chlamydial species. Such methods
include PCR detection (Rasmussen et al. (1992) Mol Cell Probes 6(5):389-394; Holland
et al. (1990) J Infect Dis 162(4):984-987); a simplified polymerase chain reaction-enzyme
immunoassay (Wilson et al. (1996) J Appl Bacteriol 80(4):431-438); and sequence
determination and restriction endonuclease cleavage (Herrmann et al. (1996) J Clin
Microbiol 34(8):1897-1902).

Antigenic and molecular analyses of different C. pneumoniae strains are
described in Jantos et al. (1997) J Clin Microbiol 35(3):620-623. Some genes of C.
pneumoniae have been isolated and sequenced. These include the GroE operon (Kikuta
et al. (1991) Infect Immun 59(12):4665-4669); the major outer membrane protein (Perez et
al. (1991) Infect Immun 59(6):2195-2199); the DnaK protein homolog (Kornak et al.
(1991) Infect Immun 59(2):721-725); as well as a number of ribosomal and other genes.




SUMMARY OF THE INVENTION 
This invention provides the genomic sequence of Chlamydia pneumoniae. 
The sequence information is useful for a variety of diagnostic and analytical methods. 
The genomic sequence may be embodied in a variety of media, including computer
readable forms, or as a nucleic acid comprising a selected fragment of the sequence.
Such fragments generally consist of an open reading frame, transcriptional or translational
control elements, or fragments derived therefrom. Proteins encoded by the open reading 
frames are useful for diagnostic purposes, as well as for their enzymatic or structural 
activity. 


DEFINITIONS 

The term "amino acid" refers to naturally occurring and synthetic amino
acids, as well as amino acid analogs and amino acid mimetics that function in a manner
similar to the naturally occurring amino acids. Naturally occurring amino acids are those
encoded by the genetic code, as well as those amino acids that are later modified, e.g.,
hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to
compounds that have the same basic chemical structure as a naturally occurring amino
acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and
an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl
sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide
backbones, but retain the same basic chemical structure as a naturally occurring amino
acid. Amino acid mimetics refers to chemical compounds that have a structure that is
different from the general chemical structure of an amino acid, but that function in a
manner similar to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known
three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB
Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by 
their commonly accepted single-letter codes. 
"Amplification primers" are oligonucleotides comprising either natural or
analogue nucleotides that can serve as the basis for the amplification of a select nucleic
acid sequence. They include, e.g., polymerase chain reaction primers and ligase chain
reaction oligonucleotides.

"Antibody" refers to an immunoglobulin molecule able to bind to a
specific epitope on an antigen. Antibodies can be a polyclonal mixture or monoclonal.
Antibodies can be intact immunoglobulins derived from natural sources or from
recombinant sources and can be immunoreactive portions of intact immunoglobulins.
Antibodies may exist in a variety of forms including, for example, Fv, Fab, and F(ab)2, as
well as in single chains. Single-chain antibodies, in which genes for a heavy chain and a
light chain are combined into a single coding sequence, may also be used.

An "antigen" is a molecule that is recognized and bound by an antibody,
e.g., peptides, carbohydrates, organic molecules, or more complex molecules such as
glycolipids and glycoproteins. The part of the antigen that is the target of antibody
binding is an antigenic determinant, and a small functional group that corresponds to a
single antigenic determinant is called a hapten.

"Biological sample" refers to any sample obtained from a living or dead
organism. Examples of biological samples include biological fluids and tissue specimens.
Such biological samples can be prepared for analysis of the presence of C. pneumoniae
nucleic acids, proteins, or antibodies specifically reactive with the proteins.

The term "C. pneumoniae gene" shall be intended to mean the open
reading frame encoding specific C. pneumoniae polypeptides, as well as adjacent 5' and
3' non-coding nucleotide sequences involved in the regulation of expression, up to about
2 kb beyond the coding region, but possibly further in either direction. The gene may be
introduced into an appropriate vector for extrachromosomal maintenance or for
integration into a host genome.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively
modified variants refers to those nucleic acids which encode identical or essentially
identical amino acid sequences, or where the nucleic acid does not encode an amino acid
sequence, to essentially identical sequences. Specifically, degenerate codon substitutions
may be achieved by generating sequences in which the third position of one or more
selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues
(Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-
2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids
encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all
encode the amino acid alanine. Thus, at every position where an alanine is specified by a
codon, the codon can be altered to any of the corresponding codons described without
altering the encoded polypeptide. Such nucleic acid variations are "silent variations,"
which are one species of conservatively modified variations. Every nucleic acid sequence
herein which encodes a polypeptide also describes every possible silent variation of the
nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG,
which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only
codon for tryptophan) can be modified to yield a functionally identical molecule.
Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is
implicit in each described sequence.
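The degeneracy argument above can be illustrated with a short sketch. It assumes the standard genetic code written in the DNA alphabet (T in place of U); the function names `translate`, `synonymous_codons` and `count_silent_variants` are hypothetical and are not taken from the application.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (DNA alphabet).
# Assumption: the standard code applies; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(codon):
    """Return every codon that encodes the same amino acid as `codon`."""
    target = CODON_TABLE[codon]
    return {c for c, aa in CODON_TABLE.items() if aa == target}


def count_silent_variants(dna):
    """Count the nucleic acids (including `dna` itself) that encode the same
    polypeptide, i.e. the product of the synonymous-codon counts per position."""
    total = 1
    for i in range(0, len(dna), 3):
        total *= len(synonymous_codons(dna[i:i + 3]))
    return total
```

For the coding sequence ATG GCA TGG (Met-Ala-Trp), only the alanine codon can vary, so four nucleic acids encode the same tripeptide.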

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alters, adds or deletes a single amino acid or a small percentage of amino
acids in the encoded sequence is a "conservatively modified variant" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well
known in the art. Such conservatively modified variants are in addition to and do not
exclude polymorphic variants, interspecies homologs, and alleles of the invention.

The following groups each contain amino acids that are conservative
substitutions for one another:

1) Alanine (A), Glycine (G);
2) Serine (S), Threonine (T);
3) Aspartic acid (D), Glutamic acid (E);
4) Asparagine (N), Glutamine (Q);
5) Cysteine (C), Methionine (M);
6) Arginine (R), Lysine (K), Histidine (H);
7) Isoleucine (I), Leucine (L), Valine (V); and
8) Phenylalanine (F), Tyrosine (Y), Tryptophan (W)

(see, e.g., Creighton, Proteins (1984)).



The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences
that are the same or have a specified percentage of amino acid residues or nucleotides that
are the same, when compared and aligned for maximum correspondence over a
comparison window, as measured using one of the following sequence comparison
algorithms or by manual alignment and visual inspection. This definition also refers to
the complement of a test sequence, which has a designated percent sequence or
subsequence complementarity when the test sequence has a designated or substantial
identity to a reference sequence. For example, a designated amino acid percent identity
of 95% refers to sequences or subsequences that have at least about 95% amino acid
identity when aligned for maximum correspondence over a comparison window as
measured using one of the following sequence comparison algorithms or by manual
alignment and visual inspection. Such sequences would then be said to have substantial
identity, or to be substantially identical to each other. Preferably, sequences have at least
about 70% identity, more preferably 80% identity, more preferably 90-95% identity and
above. Preferably, the percent identity exists over a region of the sequence that is at least
about 25 amino acids in length, more preferably over a region that is 50-100 amino acids
in length.

When percentage of sequence identity is used in reference to proteins or
peptides, it is recognized that residue positions that are not identical often differ by
conservative amino acid substitutions, where amino acid residues are substituted for
other amino acid residues with similar chemical properties (e.g., charge or
hydrophobicity) and therefore do not change the functional properties of the molecule.
Where sequences differ in conservative substitutions, the percent sequence identity may
be adjusted upwards to correct for the conservative nature of the substitution. Means for
making this adjustment are well known to those of skill in the art. Typically this involves
scoring a conservative substitution as a partial rather than a full mismatch, thereby
increasing the percentage sequence identity. Thus, for example, where an identical amino
acid is given a score of 1 and a non-conservative substitution is given a score of zero, a
conservative substitution is given a score between zero and 1. The scoring of
conservative substitutions is calculated according to, e.g., the algorithm of Meyers &
Miller, Computer Applic. Biol. Sci. 4:11-17 (1988), e.g., as implemented in the program
PC/GENE (Intelligenetics, Mountain View, California, USA).
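The adjustment described above can be sketched as follows, using the eight conservative substitution groups listed earlier in this section. This is a minimal illustration only, not the Meyers & Miller algorithm: it assumes the two sequences are already aligned, ungapped and of equal length, and the partial score of 0.5 used in the example is an arbitrary assumption (the text only requires a value between zero and 1). The names `is_conservative` and `percent_identity` are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = ["AG", "ST", "DE", "NQ", "CM", "RKH", "ILV", "FYW"]
GROUP_OF = {aa: idx for idx, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def is_conservative(a, b):
    """True if a and b are different residues from the same group."""
    return a != b and a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b)


def percent_identity(seq1, seq2, partial=0.0):
    """Percent identity of two pre-aligned, equal-length, ungapped sequences.

    Identical residues score 1, non-conservative substitutions score 0, and
    conservative substitutions score `partial` (0 gives plain identity).
    """
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(seq1, seq2):
        if a == b:
            total += 1.0
        elif is_conservative(a, b):
            total += partial
    return 100.0 * total / len(seq1)
```

For the aligned segments ASDK and GSEL, one position is identical, two are conservative (A/G and D/E) and one is non-conservative (K/L), so the plain identity is 25% and the adjusted identity with a partial score of 0.5 is 50%.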



For sequence comparison, typically one sequence acts as a reference
sequence, to which test sequences are compared. When using a sequence comparison
algorithm, test and reference sequences are entered into a computer, subsequence
coordinates are designated, if necessary, and sequence algorithm program parameters are
designated. Default program parameters can be used, or alternative parameters can be
designated. The sequence comparison algorithm then calculates the percent sequence
identity for the test sequence(s) relative to the reference sequence, based on the
designated or default program parameters.

A "comparison window" includes reference to a segment of any one of the
number of contiguous positions selected from the group consisting of from 25 to 600,
usually about 50 to about 200, more usually about 100 to about 150, in which a sequence
may be compared to a reference sequence of the same number of contiguous positions
after the two sequences are optimally aligned. Methods of alignment of sequences for
comparison are well known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv.
Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman &
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson &
Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations
of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by
manual alignment and visual inspection (see, e.g., Ausubel et al., supra).
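As a rough illustration of the global alignment approach of Needleman & Wunsch cited above, the sketch below fills a dynamic-programming matrix, traces back one optimal alignment, and reports percent identity over the alignment length. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function names are assumptions chosen for brevity; they are not the parameters of GAP or of any other program named in the text.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with a linear gap penalty.

    Returns (aligned_a, aligned_b, score); '-' marks a gap.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


def alignment_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both rows."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)
```

Aligning GATTACA with GATACA under these assumed scores yields a seven-column alignment with six matches and one gap. Local alignment in the manner of Smith & Waterman differs mainly in clamping cell scores at zero and tracing back from the highest-scoring cell.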

One example of a useful algorithm is PILEUP. PILEUP creates a multiple
sequence alignment from a group of related sequences using progressive, pairwise
alignments to show relationship and percent sequence identity. It also plots a tree or
dendrogram showing the clustering relationships used to create the alignment. PILEUP
uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol.
Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins
& Sharp, CABIOS 5:151-153 (1989). The program can align up to 300 sequences, each
of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment
procedure begins with the pairwise alignment of the two most similar sequences,
producing a cluster of two aligned sequences. This cluster is then aligned to the next
most related sequence or cluster of aligned sequences. Two clusters of sequences are
aligned by a simple extension of the pairwise alignment of two individual sequences. The
final alignment is achieved by a series of progressive, pairwise alignments. The program
is run by designating specific sequences and their amino acid or nucleotide coordinates
for regions of sequence comparison and by designating the program parameters. Using
PILEUP, a reference sequence is compared to other test sequences to determine the
percent sequence identity relationship using the following parameters: default gap weight
(3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained
from the GCG sequence analysis software package, e.g., version 7.0 (Devereaux et al.,
Nuc. Acids Res. 12:387-395 (1984)).

Another example of an algorithm that is suitable for determining percent
sequence identity (i.e., substantial similarity or identity) is the BLAST algorithm, which
is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing
BLAST analyses is publicly available through the National Center for Biotechnology
Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying
high scoring sequence pairs (HSPs) by identifying short words of length W in the query
sequence, which either match or satisfy some positive-valued threshold score T when
aligned with a word of the same length in a database sequence. T is referred to as the
neighborhood word score threshold (Altschul et al., supra). These initial neighborhood
word hits act as seeds for initiating searches to find longer HSPs containing them. The
word hits are then extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by
the quantity X from its maximum achieved value; the cumulative score goes to zero or
below, due to the accumulation of one or more negative-scoring residue alignments; or
the end of either sequence is reached. The BLAST algorithm parameters W, T, and X
determine the sensitivity and speed of the alignment. The BLASTN program (for
nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10,
M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP
program uses as default parameters a wordlength (W) of 3, an expectation (E) of 10, and
the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA
89:10915 (1989)).
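The seed-and-extend idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the BLAST implementation: seeds are exact word matches only (the neighborhood threshold T is ignored), extension is ungapped, and the drop-off value X of 20 is an arbitrary default. The reward M=5 and penalty N=-4 follow the BLASTN defaults quoted in the text; the function names `find_seeds` and `extend` are hypothetical.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def extend(query, subject, qi, sj, w, m=5, n=-4, x=20):
    """Ungapped extension of the seed query[qi:qi+w] == subject[sj:sj+w].

    Extension in each direction stops when the running score falls x below
    its best value, when the cumulative score reaches zero or below, or when
    either sequence ends. Returns (score, query_start, query_end).
    """
    base = w * m  # score of the seed itself

    def one_direction(q, s, step):
        best = running = 0
        best_len = length = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            running += m if query[q] == subject[s] else n
            length += 1
            if running > best:
                best, best_len = running, length
            if best - running >= x or base + running <= 0:
                break
            q += step
            s += step
        return best, best_len

    right_score, right_len = one_direction(qi + w, sj + w, +1)
    left_score, left_len = one_direction(qi - 1, sj - 1, -1)
    return base + left_score + right_score, qi - left_len, qi + w + right_len
```

With the query AAACCCGGGTTT embedded at offset 2 of a longer subject and W=4, every seed extends to the full 12-base match with a score of 60 (12 matches at 5 each).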



The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is
the smallest sum probability (P(N)), which provides an indication of the probability by
which a match between two nucleotide or amino acid sequences would occur by chance.
For example, a nucleic acid is considered similar to a reference sequence if the smallest
sum probability in a comparison of the test nucleic acid to the reference nucleic acid is
less than about 0.1, more preferably less than about 0.01, and most preferably less than
about 0.001.

An indication that two nucleic acid sequences or polypeptides are
substantially identical is that the polypeptide encoded by the first nucleic acid is
immunologically cross reactive with the antibodies raised against the polypeptide
encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically
substantially identical to a second polypeptide, for example, where the two peptides differ
only by conservative substitutions. Another indication that two nucleic acid sequences
are substantially identical is that the two molecules or their complements hybridize to
each other under stringent conditions, as described below.

Another indication that polynucleotide sequences are substantially
identical is if two molecules hybridize to each other under stringent conditions. Stringent
conditions are sequence dependent and will be different in different circumstances.
Generally, stringent conditions are selected to be about 5°C lower than the thermal
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm
is the temperature (under defined ionic strength and pH) at which 50% of the target
sequence hybridizes to a perfectly matched probe. Typically, stringent conditions for a
Southern blot protocol involve hybridizing in a buffer comprising 5x SSC, 1% SDS at
65°C or hybridizing in a buffer containing 5x SSC and 1% SDS at 42°C and washing at
65°C with a 0.2x SSC, 0.1% SDS wash.
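The text does not state how Tm is estimated. Purely as an illustration of the "about 5°C below Tm" rule, the sketch below uses the Wallace rule of thumb for short oligonucleotides (2°C per A or T, 4°C per G or C). That rule is an assumption introduced here, is only a rough guide for probes of roughly 14 to 20 bases, and does not model ionic strength or pH; the function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough melting temperature in degrees C for a short DNA oligonucleotide.

    Assumption: Wallace rule, Tm = 2*(A+T) + 4*(G+C). Not from the source text.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligo may contain only A, C, G and T")
    return 2 * at + 4 * gc


def stringent_temperature(oligo, offset=5):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset
```

For a 14-mer with seven A/T and seven G/C bases, the estimate is 42°C, which places a stringent hybridization temperature at about 37°C under this rule of thumb.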

A "label" is a composition detectable by spectroscopic, photochemical,
biochemical, immunochemical, or chemical means. For example, useful labels include
32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an
ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal
antibodies are available.

The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides
and polymers thereof in either single- or double-stranded form. The term encompasses
nucleic acids containing known nucleotide analogs or modified backbone residues or
linkages, which are synthetic, naturally occurring, and non-naturally occurring, which
have similar binding properties as the reference nucleic acid, and which are metabolized
in a manner similar to the reference nucleotides. Examples of such analogs include,
without limitation, phosphorothioates, phosphoramidates, methyl phosphonates,
chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs).

Unless otherwise indicated, a particular nucleic acid sequence also
implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon
substitutions) and complementary sequences, as well as the sequence explicitly indicated.
The term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide,
and polynucleotide.

As used herein, a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence
through one or more types of chemical bonds, usually through complementary base
pairing, usually through hydrogen bond formation. As used herein, a probe may include
natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In
addition, the bases in a probe may be joined by a linkage other than a phosphodiester
bond, so long as it does not interfere with hybridization. Thus, for example, probes may
be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather
than phosphodiester linkages. It will be understood by one of skill in the art that probes
may bind target sequences lacking complete complementarity with the probe sequence
depending upon the stringency of the hybridization conditions. The probes are preferably
directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly
labeled such as with biotin to which a streptavidin complex may later bind. By assaying
for the presence or absence of the probe, one can detect the presence or absence of the
select sequence or subsequence.

A labeled nucleic acid probe or oligonucleotide is one that is bound, either
covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label
such that the presence of the probe may be detected by detecting the presence of the label
bound to the probe.

"Pharmaceutically acceptable" means a material that is not biologically or
otherwise undesirable, i.e., the material can be administered to an individual along with a
Chlamydia antigen without causing any undesirable biological effects or interacting in a
deleterious manner with any of the other components of the pharmaceutical composition.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid
polymers in which one or more amino acid residue is an analog or mimetic of a
corresponding naturally occurring amino acid, as well as to naturally occurring amino
acid polymers.

The phrase "specifically or selectively hybridizing to" refers to
hybridization between a probe and a target sequence in which the probe binds
substantially only to the target sequence, forming a hybridization complex, when the
target is in a heterogeneous mixture of polynucleotides and other compounds. Such
hybridization is determinative of the presence of the target sequence. Although the probe
may bind other unrelated sequences, at least 90%, preferably 95% or more of the
hybridization complexes formed are with the target sequence.

The term "recombinant" when used with reference to a cell, or nucleic
acid, or vector, indicates that the cell, or nucleic acid, or vector, has been modified by the
introduction of a heterologous nucleic acid or the alteration of a native nucleic acid, or
that the cell is derived from a cell so modified. Thus, for example, recombinant cells
express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all.

The phrase "specifically immunoreactive with", when referring to a protein
or peptide, refers to a binding reaction between the protein and an antibody which is
determinative of the presence of the protein in the presence of a heterogeneous population
of proteins and other compounds. Thus, under designated immunoassay conditions, the
specified antibodies bind to a particular protein and do not bind in a significant amount to
other proteins present in the sample. Specific binding to an antibody under such
conditions may require an antibody that is selected for its specificity for a particular
protein. A variety of immunoassay formats may be used to select antibodies specifically
immunoreactive with a particular protein and are described in detail below.






The phrase "substantially pure" or "isolated," when referring to a
Chlamydia peptide or protein, means a chemical composition which is free of other
subcellular components of the Chlamydia organism. Typically, a monomeric protein is
substantially pure when at least about 85% or more of a sample exhibits a single
polypeptide backbone. Minor variants or chemical modifications may typically share the
same polypeptide sequence. Depending on the purification procedure, purities of 85%,
and preferably over 95%, are possible. Protein purity or homogeneity may be
indicated by a number of means well known in the art, such as polyacrylamide gel
electrophoresis of a protein sample, followed by visualizing a single polypeptide band on
a polyacrylamide gel upon silver staining. For certain purposes high resolution will be
needed and HPLC or a similar means for purification utilized.

DETAILED DESCRIPTION 
The present invention provides the nucleotide sequence of the C.
pneumoniae genome, SEQ ID NO: 1, or a representative fragment thereof, in a form which
can be readily used, analyzed, and interpreted by a skilled artisan. As used herein, a
"representative fragment" of the nucleotide sequence depicted in SEQ ID NO: 1 refers to
any portion which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are open reading frames,
expression modulating fragments, uptake modulating fragments, and fragments which can
be used to diagnose the presence of C. pneumoniae in a sample. Using the information
provided in the present application, together with routine cloning and sequencing
methods, one of ordinary skill in the art will be able to clone and sequence all
"representative fragments" of interest including open reading frames (ORFs) encoding a
large variety of C. pneumoniae proteins. A non-limiting identification of such preferred
representative fragments is provided in Tables 2 and 3.
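As a hedged illustration of how open reading frames might be located in a genomic sequence, the sketch below scans all six reading frames for an ATG followed in frame by a stop codon. It is not the method used to prepare Tables 2 and 3, which this section does not describe; the choice of ATG as the only start codon, the `min_codons` cutoff and the function names are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Find ATG...stop open reading frames in all six frames.

    Returns a list of (strand, start, end, orf_sequence); start and end are
    0-based, end-exclusive coordinates on the scanned strand (for "-" they
    refer to the reverse complement). Length counts the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, pos + 3, s[start:pos + 3]))
                    start = None
    return orfs
```

For the toy sequence CCATGAAATGGTAACC, the scan reports a single forward-strand ORF, ATGAAATGGTAA, spanning positions 2 to 14. A realistic minimum length for a bacterial genome would be far larger than the value used in this toy example.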

Diagnostic use of C. pneumoniae nucleic acids 

Hybridization-based assays

Using the nucleic acids disclosed here, one of skill can design nucleic acid
hybridization-based assays for the detection of C. pneumoniae. Any of a number of well
known techniques for the specific detection of target nucleic acids can be used.
Exemplary hybridization-based assays include, but are not limited to, traditional "direct
probe" methods such as Southern Blots, dot blots, in situ hybridization (e.g., FISH), PCR,
and the like. The methods can be used in a wide variety of formats including, but not
limited to, substrate- (e.g., membrane or glass) bound methods or array-based approaches
as described below. As noted above, this invention also embraces methods for detecting
the presence of Chlamydia DNA or RNA in biological samples. These sequences can be
used to detect Chlamydia in biological samples from patients suspected of being infected.
A variety of methods of specific DNA and RNA measurement using nucleic acid
hybridization techniques are known to those of skill in the art (see Sambrook et al.,
supra).

In situ hybridization assays are well known (e.g., Angerer (1987) Meth.
Enzymol. 152: 649). Generally, in situ hybridization comprises the following major steps:
(1) fixation of tissue or biological structure to be analyzed; (2) prehybridization treatment of
the biological structure to increase accessibility of target DNA, and to reduce nonspecific
binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the
biological structure or tissue; (4) post-hybridization washes to remove nucleic acid
fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid
fragments. The reagent used in each of these steps and the conditions for use vary
depending on the particular application.

In a typical in situ hybridization assay, cells are fixed to a solid support,
typically a glass slide. If a nucleic acid is to be probed, the cells are typically denatured
with heat or alkali. The cells are then contacted with a hybridization solution at a
moderate temperature to permit annealing of labeled probes specific to the nucleic acid
sequence encoding the protein. The targets (e.g., cells) are then typically washed at a
predetermined stringency or at an increasing stringency until an appropriate signal to
noise ratio is obtained.

The nucleic acids of this invention are particularly well suited to array-based
hybridization formats. Arrays are a multiplicity of different "probe" or "target"
nucleic acids (or other compounds) attached to one or more surfaces (e.g., solid,
membrane, or gel). In a preferred embodiment, the multiplicity of nucleic acids (or other
moieties) is attached to a single contiguous surface or to a multiplicity of surfaces
juxtaposed to each other.

In an array format a large number of different hybridization reactions can
be run essentially "in parallel." This provides rapid, essentially simultaneous, evaluation
of a number of hybridizations in a single "experiment". Methods of performing
hybridization reactions in array-based formats are well known to those of skill in the art
(see, e.g., Pastinen (1997) Genome Res. 7: 606-614; Jackson (1996) Nature
Biotechnology 14: 1685; Chee (1995) Science 274: 610; WO 96/17958).

Arrays, particularly nucleic acid arrays, can be produced according to a
wide variety of methods well known to those of skill in the art. For example, in a simple
embodiment, "low density" arrays can simply be produced by spotting (e.g., by hand using
a pipette) different nucleic acids at different locations on a solid support (e.g., a glass
surface, a membrane, etc.).

This simple spotting approach has been automated to produce high
density spotted arrays (see, e.g., U.S. Patent No. 5,807,522). This patent describes the
use of an automated system that taps a microcapillary against a surface to deposit a small
volume of a biological sample. The process is repeated to generate high density arrays.
Arrays can also be produced using oligonucleotide synthesis technology. Thus, for
example, U.S. Patent No. 5,143,854 and PCT patent publication Nos. WO 90/15070 and
92/10092 teach the use of light-directed combinatorial synthesis of high density
oligonucleotide arrays.

Many methods for immobilizing nucleic acids on a variety of solid
surfaces are known in the art. A wide variety of organic and inorganic polymers, as well
as other materials, both natural and synthetic, can be employed as the material for the
solid surface. Illustrative solid surfaces include, e.g., nitrocellulose, nylon, glass, quartz,
diazotized membranes (paper or nylon), silicones, polyformaldehyde, cellulose, and
cellulose acetate. In addition, plastics such as polyethylene, polypropylene, polystyrene,
and the like can be used. Other materials which may be employed include paper,
ceramics, metals, metalloids, semiconductive materials, cermets or the like. In addition,
substances that form gels can be used. Such materials include, e.g., proteins (e.g.,
gelatins), lipopolysaccharides, silicates, agarose and polyacrylamides. Where the solid
surface is porous, various pore sizes may be employed depending upon the nature of the
system.

In preparing the surface, a plurality of different materials may be
employed, particularly as laminates, to obtain various properties. For example, proteins
(e.g., bovine serum albumin) or mixtures of macromolecules (e.g., Denhardt's solution)
can be employed to avoid non-specific binding, simplify covalent conjugation, enhance
signal detection or the like. If covalent bonding between a compound and the surface is
desired, the surface will usually be polyfunctional or be capable of being
polyfunctionalized. Functional groups which may be present on the surface and used for
linking can include carboxylic acids, aldehydes, amino groups, cyano groups, ethylenic
groups, hydroxyl groups, mercapto groups and the like. The manner of linking a wide
variety of compounds to various surfaces is well known and is amply illustrated in the
literature.

For example, methods for immobilizing nucleic acids by introduction of
various functional groups to the molecules are known (see, e.g., Bischoff (1987) Anal.
Biochem., 164: 336-344; Kremsky (1987) Nucl. Acids Res. 15: 2891-2910). Modified
nucleotides can be placed on the target using PCR primers containing the modified
nucleotide, or by enzymatic end labeling with modified nucleotides. Use of glass or
membrane supports (e.g., nitrocellulose, nylon, polypropylene) for the nucleic acid arrays
of the invention is advantageous because of well developed technology employing
manual and robotic methods of arraying targets at relatively high element densities. Such
membranes are generally available, and protocols and equipment for hybridization to
membranes are well known.

Target elements of various sizes, ranging from 1 mm diameter down to 1
µm, can be used. Smaller target elements containing low amounts of concentrated, fixed
probe DNA are used for high complexity comparative hybridizations since the total
amount of sample available for binding to each target element will be limited. Thus it is
advantageous to have small array target elements that contain a small amount of
concentrated probe DNA so that the signal that is obtained is highly localized and bright.
Such small array target elements are typically used in arrays with densities greater than
10⁴/cm². Relatively simple approaches capable of quantitative fluorescent imaging of 1
cm² areas have been described that permit acquisition of data from a large number of
target elements in a single image (see, e.g., Wittrup (1994) Cytometry 16: 206-213).
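
As a rough illustration of the densities mentioned above, the following sketch converts an element density into the center-to-center spacing of the elements. It is an illustrative calculation only; the function name and the assumption of a regular square grid are hypothetical and are not taken from the methods described here.

```python
import math

def pitch_um(density_per_cm2: float) -> float:
    """Center-to-center spacing (in µm) of a square grid with the given density.

    Assumes elements are laid out on a regular square grid (an illustrative
    simplification).
    """
    if density_per_cm2 <= 0:
        raise ValueError("density must be positive")
    elements_per_cm = math.sqrt(density_per_cm2)  # elements along a 1 cm edge
    return 10_000.0 / elements_per_cm             # 1 cm = 10,000 µm

# A density of 10^4 elements/cm^2 corresponds to a spacing of about 100 µm.
print(pitch_um(1e4))
```

Under this assumption, each target element must be smaller than the computed spacing, which is consistent with the use of small target elements at high densities.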

If fluorescently labeled nucleic acid samples are used, arrays on solid
surface substrates with much lower fluorescence than membranes, such as glass, quartz,
or small beads, can achieve much better sensitivity. Substrates such as glass or fused
silica are advantageous in that they provide a very low fluorescence substrate, and a
highly efficient hybridization environment. Covalent attachment of the target nucleic
acids to glass or synthetic fused silica can be accomplished according to a number of
known techniques (described above). Nucleic acids can be conveniently coupled to glass
using commercially available reagents. For instance, materials for preparation of
silanized glass with a number of functional groups are commercially available or can be
prepared using standard techniques (see, e.g., Gait (1984) Oligonucleotide Synthesis: A
Practical Approach, IRL Press, Wash., D.C.). Quartz cover slips, which have at least
10-fold lower autofluorescence than glass, can also be silanized.

Alternatively, probes can also be immobilized on commercially available
coated beads or other surfaces. For instance, biotin end-labeled nucleic acids can be
bound to commercially available avidin-coated beads. Streptavidin or anti-digoxigenin
antibody can also be attached to silanized glass slides by protein-mediated coupling using,
e.g., protein A following standard protocols (see, e.g., Smith (1992) Science 258:
1122-1126). Biotin or digoxigenin end-labeled nucleic acids can be prepared according to
standard techniques. Hybridization to nucleic acids attached to beads is accomplished by
suspending them in the hybridization mix, and then depositing them on the glass substrate
for analysis after washing. Alternatively, paramagnetic particles, such as ferric oxide
particles, with or without avidin coating, can be used.

A variety of other nucleic acid hybridization formats are known to those
skilled in the art. For example, common formats include sandwich assays and
competition or displacement assays. Hybridization techniques are generally described in
Hames and Higgins (1985) Nucleic Acid Hybridization, A Practical Approach, IRL Press;
Gall and Pardue (1969) Proc. Natl. Acad. Sci. USA 63: 378-383; and John et al. (1969)
Nature 223: 582-587.

Sandwich assays are commercially useful hybridization assays for
detecting or isolating nucleic acid sequences. Such assays utilize a "capture" nucleic acid
covalently immobilized to a solid support and a labeled "signal" nucleic acid in solution.
The sample will provide the target nucleic acid. The "capture" nucleic acid and "signal"
nucleic acid probe hybridize with the target nucleic acid to form a "sandwich"
hybridization complex. To be most effective, the signal nucleic acid should not hybridize
with the capture nucleic acid.
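
The requirement that the signal nucleic acid not hybridize with the capture nucleic acid can be screened for computationally before probes are synthesized. The following is a minimal sketch, assuming DNA probes written 5' to 3' and treating any long stretch of perfect complementarity as a warning sign. The function names and the default threshold of 8 contiguous bases are hypothetical choices for illustration, not values taken from this disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def probes_may_cross_hybridize(capture: str, signal: str, max_run: int = 8) -> bool:
    """True if signal could pair with capture over more than max_run contiguous bases.

    A stretch of the signal probe can base-pair with the capture probe when it
    is identical to a stretch of the capture probe's reverse complement.
    """
    return longest_shared_run(reverse_complement(capture), signal.upper()) > max_run

capture = "ATGCCGTTAGCAAGTC"                      # made-up example sequence
print(probes_may_cross_hybridize(capture, reverse_complement(capture)))
print(probes_may_cross_hybridize(capture, "AAAAAAAAAAAAAAAA"))
```

A screen of this kind considers only perfect contiguous matches; it does not model mismatched duplexes or hybridization conditions, which would still need to be evaluated experimentally.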

Detection of a hybridization complex may require the binding of a signal
generating complex to a duplex of target and probe polynucleotides or nucleic acids.
Typically, such binding occurs through ligand and anti-ligand interactions as between a
ligand-conjugated probe and an anti-ligand conjugated with a signal.

The sensitivity of the hybridization assays may be enhanced through use of
a nucleic acid amplification system that multiplies the target nucleic acid being detected.
Examples of such systems include the polymerase chain reaction (PCR) system and the
ligase chain reaction (LCR) system. Other methods recently described in the art are the
nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario)
and Q Beta Replicase systems.

Nucleic acid hybridization simply involves providing a denatured probe
and target nucleic acid under conditions where the probe and its complementary target
can form stable hybrid duplexes through complementary base pairing. The nucleic acids
that do not form hybrid duplexes are then washed away, leaving the hybridized nucleic
acids to be detected, typically through detection of an attached detectable label. It is
generally recognized that nucleic acids are denatured by increasing the temperature,
decreasing the salt concentration of the buffer containing the nucleic acids, adding
chemical agents, or raising the pH. Under low stringency conditions
(e.g., low temperature and/or high salt and/or high target concentration) hybrid duplexes
(e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed
sequences are not perfectly complementary. Thus specificity of hybridization is reduced
at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower
salt) successful hybridization requires fewer mismatches.
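
The dependence of duplex stability on salt and denaturant can be illustrated with a commonly quoted empirical approximation for the melting temperature of long DNA:DNA hybrids. The sketch below is not part of the disclosed methods. The coefficients are those of one widely cited form of the formula and differ somewhat between references, and the example GC content, duplex length, and salt concentrations are arbitrary illustrative values.

```python
import math

def estimate_tm_celsius(gc_percent: float, length_bp: int,
                        na_molar: float, formamide_percent: float = 0.0) -> float:
    """Approximate melting temperature (deg C) of a long DNA:DNA hybrid.

    Empirical approximation: the estimate rises with salt and GC content and
    falls with formamide and with shorter duplexes.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)

# Arbitrary example: the same 500 bp, 40% GC duplex under three conditions.
high_salt = estimate_tm_celsius(40, 500, na_molar=0.9)
low_salt = estimate_tm_celsius(40, 500, na_molar=0.04)
with_formamide = estimate_tm_celsius(40, 500, na_molar=0.9, formamide_percent=50)
print(high_salt > low_salt, high_salt > with_formamide)
```

Lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a wash at a fixed temperature becomes more stringent under those conditions.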

One of skill in the art will appreciate that hybridization conditions may be
selected to provide any degree of stringency. In a preferred embodiment, hybridization is
performed at low stringency to ensure hybridization and then subsequent washes are
performed at higher stringency to eliminate mismatched hybrid duplexes. Successive
washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25
X SSPE-T at 37°C to 70°C) until a desired level of hybridization specificity is obtained.
Stringency can also be increased by addition of agents such as formamide. Hybridization
specificity may be evaluated by comparison of hybridization to the test probes with
hybridization to the various controls that can be present.

In general, there is a tradeoff between hybridization specificity
(stringency) and signal intensity. Thus, in a preferred embodiment, the wash is performed
at the highest stringency that produces consistent results and that provides a signal
intensity greater than approximately 10% of the background intensity. Thus, in a
preferred embodiment, the hybridized array may be washed in solutions of successively
higher stringency and read between each wash. Analysis of the data sets thus produced
will reveal a wash stringency above which the hybridization pattern is not appreciably
altered and which provides adequate signal for the particular probes of interest.
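
The analysis of successive reads described above can be sketched as follows. This is a minimal illustration under stated assumptions. Each read is taken to be a list of intensities, one per array element, the "pattern" is taken to be the intensities normalized to their sum, and the thresholds `max_change` and `min_signal` are hypothetical parameters that a user would have to choose for a particular assay.

```python
def normalize(read):
    """Scale a list of intensities so that they sum to 1."""
    total = sum(read)
    return [x / total for x in read] if total else list(read)

def pattern_change(a, b):
    """Largest absolute difference between two normalized intensity patterns."""
    return max(abs(x - y) for x, y in zip(normalize(a), normalize(b)))

def choose_wash(reads, probes_of_interest, min_signal, max_change=0.05):
    """Index of the first wash beyond which the pattern no longer changes.

    reads: intensity lists ordered from lowest to highest wash stringency.
    probes_of_interest: indices of the elements that must retain signal.
    Returns None if no wash is both stable and bright enough.
    """
    for i in range(len(reads) - 1):
        stable = all(pattern_change(reads[j], reads[j + 1]) <= max_change
                     for j in range(i, len(reads) - 1))
        bright = min(reads[i][p] for p in probes_of_interest) >= min_signal
        if stable and bright:
            return i
    return None

# Made-up intensities for three elements read after four successive washes.
reads = [[100, 80, 60], [100, 40, 10], [98, 39, 10], [50, 20, 5]]
print(choose_wash(reads, probes_of_interest=[0, 1], min_signal=30))
```

In this made-up data set the pattern changes markedly between the first and second wash and is essentially unchanged afterwards, so the second wash is the one selected.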

Methods of optimizing hybridization conditions are well known to those of
skill in the art (see, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and
Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, Elsevier, N.Y.).

Labeling and detection of nucleic acids. 

In a preferred embodiment, the hybridized nucleic acids are detected by
detecting one or more labels attached to the sample or probe nucleic acids. The labels
may be incorporated by any of a number of means well known to those of skill in the art.
Means of attaching labels to nucleic acids include, for example, nick translation or
end-labeling (e.g., with a labeled RNA) by kinasing of the nucleic acid and subsequent
attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label
(e.g., a fluorophore). A wide variety of linkers for the attachment of labels to nucleic
acids are also known. In addition, intercalating dyes and fluorescent nucleotides can also
be used.

Detectable labels suitable for use in the present invention include any
composition detectable by spectroscopic, photochemical, biochemical, immunochemical,
electrical, optical or chemical means. Useful labels in the present invention include biotin
for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™),
fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and
the like; see, e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., ³H, ¹²⁵I,
³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others
commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g., gold
particles in the 40-80 nm diameter size range scatter green light with high efficiency) or
colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents
teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350;
3,996,345; 4,277,437; 4,275,149; and 4,366,241.

A fluorescent label is preferred because it provides a very strong signal
with low background. It is also optically detectable at high resolution and sensitivity
through a quick scanning procedure. The nucleic acid samples can all be labeled with a
single label, e.g., a single fluorescent label. Alternatively, in another embodiment,
different nucleic acid samples can be simultaneously hybridized where each nucleic acid
sample has a different label. For instance, one target could have a green fluorescent label
and a second target could have a red fluorescent label. The scanning step will distinguish
sites of binding of the red label from those binding the green fluorescent label. Each
nucleic acid sample (target nucleic acid) can be analyzed independently from one another.

Suitable chromogens which can be employed include those molecules and
compounds which absorb light in a distinctive range of wavelengths so that a color can be
observed or, alternatively, which emit light when irradiated with radiation of a particular
wavelength or wavelength range, e.g., fluorescers.

Desirably, fluorescers should absorb light above about 300 nm, preferably
about 350 nm, and more preferably above about 400 nm, usually emitting at wavelengths
greater than about 10 nm higher than the wavelength of the light absorbed. It should be
noted that the absorption and emission characteristics of the bound dye can differ from
the unbound dye. Therefore, when referring to the various wavelength ranges and
characteristics of the dyes, it is intended to indicate the dyes as employed and not the dye
which is unconjugated and characterized in an arbitrary solvent.

Fluorescers are generally preferred because by irradiating a fluorescer with 
light, one can obtain a plurality of emissions. Thus, a single label can provide for a 
plurality of measurable events. 

Detectable signal can also be provided by chemiluminescent and
bioluminescent sources. Chemiluminescent sources include a compound which becomes
electronically excited by a chemical reaction and can then emit light which serves as the
detectable signal or donates energy to a fluorescent acceptor. Alternatively, luciferins can
be used in conjunction with luciferase or lucigenins to provide bioluminescence.

Spin labels are provided by reporter molecules with an unpaired electron spin which can
be detected by electron spin resonance (ESR) spectroscopy. Exemplary spin labels
include organic free radicals, transition metal complexes, particularly vanadium,
copper, iron, and manganese, and the like. Exemplary spin labels also include nitroxide
free radicals.

The label may be added to the target (sample) nucleic acid(s) prior to, or
after the hybridization. So called "direct labels" are detectable labels that are directly
attached to or incorporated into the target (sample) nucleic acid prior to hybridization. In
contrast, so called "indirect labels" are joined to the hybrid duplex after hybridization.
Often, the indirect label is attached to a binding moiety that has been attached to the
target nucleic acid prior to the hybridization. Thus, for example, the target nucleic acid
may be biotinylated before the hybridization. After hybridization, an avidin-conjugated
fluorophore will bind the biotin bearing hybrid duplexes providing a label that is easily
detected. For a detailed review of methods of labeling nucleic acids and detecting labeled
hybridized nucleic acids see Laboratory Techniques in Biochemistry and Molecular
Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed., Elsevier, N.Y.
(1993).

Fluorescent labels are easily added during an in vitro transcription
reaction. Thus, for example, fluorescein labeled UTP and CTP can be incorporated into
the RNA produced in an in vitro transcription.

The labels can be attached directly or through a linker moiety. In general,
the site of label or linker-label attachment is not limited to any specific position. For
example, a label may be attached to a nucleoside, nucleotide, or analogue thereof at any
position that does not interfere with detection or hybridization as desired. For example,
certain Label-ON Reagents from Clontech (Palo Alto, CA) provide for labeling
interspersed throughout the phosphate backbone of an oligonucleotide and for terminal
labeling at the 3' and 5' ends. As shown for example herein, labels can be attached at
positions on the ribose ring or the ribose can be modified and even eliminated as desired.
The base moieties of useful labeling reagents can include those that are naturally
occurring or modified in a manner that does not interfere with the purpose to which they
are put. Modified bases include but are not limited to 7-deaza A and G, 7-deaza-8-aza A
and G, and other heterocyclic moieties.

It will be recognized that fluorescent labels are not to be limited to single
species organic molecules, but include inorganic molecules, multi-molecular mixtures of
organic and/or inorganic molecules, crystals, heteropolymers, and the like. Thus, for
example, CdSe-CdS core-shell nanocrystals enclosed in a silica shell can be easily
derivatized for coupling to a biological molecule (Bruchez et al. (1998) Science, 281:
2013-2016). Similarly, highly fluorescent quantum dots (zinc sulfide-capped cadmium
selenide) have been covalently coupled to biomolecules for use in ultrasensitive
biological detection (Warren and Nie (1998) Science, 281: 2016-2018).

Amplification-based assays. 

In another embodiment, amplification-based assays can be used to detect
nucleic acids. In such amplification-based assays, the nucleic acid sequences act as a
template in an amplification reaction (e.g., polymerase chain reaction (PCR)). Detailed
protocols for quantitative PCR are provided in Innis et al. (1990) PCR Protocols, A Guide
to Methods and Applications, Academic Press, Inc., N.Y.

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see Wu and Wallace (1989) Genomics 4: 560; Landegren et al.
(1988) Science 241: 1077; and Barringer et al. (1990) Gene 89: 117), transcription
amplification (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), and
self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:
1874).

Detection of C. pneumoniae gene expression

The nucleic acids of the invention can also be used to detect C. pneumoniae
gene transcripts. Methods of detecting and/or quantifying gene transcripts using
nucleic acid hybridization techniques are known to those of skill in the art (see Sambrook
et al., supra). For example, a Northern transfer may be used for the detection of the
desired mRNA directly. In brief, the mRNA is isolated from a given cell sample using,
for example, an acid guanidinium-phenol-chloroform extraction method. The mRNA is
then electrophoresed to separate the mRNA species and the mRNA is transferred from the
gel to a nitrocellulose membrane. As with the Southern blots, labeled probes are used to
identify and/or quantify the target mRNA.

In another preferred embodiment, the gene transcript can be measured
using amplification (e.g., PCR) based methods as described above for directly assessing
copy number of the target sequences.

Expression of C. pneumoniae proteins 

The nucleic acids disclosed here can be used for recombinant expression
of the proteins. In these methods, the nucleic acids encoding the proteins of interest are
introduced into suitable host cells, followed by induction of the cells to produce large
amounts of the protein. The invention relies on routine techniques in the field of
recombinant genetics, well known to those of ordinary skill in the art. A basic text
disclosing the general methods of use in this invention is Sambrook et al., Molecular
Cloning, A Laboratory Manual (2nd ed. 1989).

Standard transfection methods are used to produce prokaryotic,
mammalian, yeast or insect cell lines which express large quantities of the desired
polypeptide, which is then purified using standard techniques (see, e.g., Colley et al., J.
Biol. Chem. 264:17619-17622, 1989; Guide to Protein Purification, supra).

The nucleotide sequences used to transfect the host cells can be modified
to yield Chlamydia polypeptides with a variety of desired properties. For example, the
polypeptides can vary from the naturally-occurring sequence at the primary structure
level by amino acid insertions, substitutions, deletions, and the like. These modifications
can be used in a number of combinations to produce the final modified protein chain.

The amino acid sequence variants can be prepared with various objectives
in mind, including facilitating purification and preparation of the recombinant
polypeptide. The modified polypeptides are also useful for modifying plasma half life,
improving therapeutic efficacy, and lessening the severity or occurrence of side effects
during therapeutic use. The amino acid sequence variants are usually predetermined
variants not found in nature that exhibit the same immunogenic activity as the naturally
occurring protein. In general, modifications of the sequences encoding the polypeptides
may be readily accomplished by a variety of well-known techniques, such as site-directed
mutagenesis (see Gillman & Smith, Gene 8:81-97 (1979); Roberts et al., Nature
328:731-734 (1987)). One of ordinary skill will appreciate that the effect of many
mutations is difficult to predict. Thus, most modifications are evaluated by routine
screening in a suitable assay for the desired characteristic. For instance, the effect of
various modifications on the ability of the polypeptide to elicit a protective immune
response can be easily determined using in vitro assays. For instance, the polypeptides can
be tested for their ability to induce lymphoproliferation, T cell cytotoxicity, or cytokine
production using standard techniques.

The particular procedure used to introduce the genetic material into the
host cell for expression of the polypeptide is not particularly critical. Any of the well
known procedures for introducing foreign nucleotide sequences into host cells may be
used. These include the use of calcium phosphate transfection, spheroplasts,
electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the
other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA
or other foreign genetic material into a host cell (see Sambrook et al., supra). It is only
necessary that the particular procedure utilized be capable of successfully introducing at
least one gene into the host cell which is capable of expressing the gene.



Any of a number of well known cells and cell lines can be used to express
the polypeptides of the invention. For instance, prokaryotic cells such as E. coli can be
used. Suitable eukaryotic cells include yeast, Chinese hamster ovary (CHO) cells, COS
cells, and insect cells.

The particular vector used to transport the genetic information into the cell
is also not particularly critical. Any of the conventional vectors used for expression of
recombinant proteins in prokaryotic and eukaryotic cells may be used. Expression
vectors for mammalian cells typically contain regulatory elements from eukaryotic
viruses.

The expression vector typically contains a transcription unit or expression
cassette that contains all the elements required for the expression of the polypeptide DNA
in the host cells. A typical expression cassette contains a promoter operably linked to the
DNA sequence encoding a polypeptide and signals required for efficient polyadenylation
of the transcript. The term "operably linked" as used herein refers to linkage of a
promoter upstream from a DNA sequence such that the promoter mediates transcription
of the DNA sequence. The promoter is preferably positioned about the same distance
from the heterologous transcription start site as it is from the transcription start site in its
natural setting. As is known in the art, however, some variation in this distance can be
accommodated without loss of promoter function.

Following the growth of the recombinant cells and expression of the
polypeptide, the culture medium is harvested for purification of the secreted protein. The
media are typically clarified by centrifugation or filtration to remove cells and cell debris
and the proteins are concentrated by adsorption to any suitable resin or by use of
ammonium sulfate fractionation, polyethylene glycol precipitation, or by ultrafiltration.
Other routine means known in the art may be equally suitable. Further purification of the
polypeptide can be accomplished by standard techniques, for example, affinity
chromatography, ion exchange chromatography, sizing chromatography, His6 tagging and
Ni-agarose chromatography (as described in Dobeli et al., Mol. and Biochem. Parasit.
41:259-268 (1990)), or other protein purification techniques to obtain homogeneity. The
purified proteins are then used to produce pharmaceutical compositions, as described
below.

An alternative method of preparing recombinant polypeptides useful as
vaccines involves the use of recombinant viruses (e.g., vaccinia). Vaccinia virus is grown
in suitable cultured mammalian cells such as the HeLa S3 spinner cells, as described by
Mackett et al., in DNA cloning Vol. II: A practical approach, pp. 191-211 (Glover, ed.).

Antibody Production 

The proteins of the present invention can be used to produce antibodies
specifically reactive with C. pneumoniae antigens. If isolated proteins are used, they may
be recombinantly produced or isolated from Chlamydia cultures. Synthetic peptides
made using the protein sequences may also be used.

Methods of production of polyclonal antibodies are known to those of skill
in the art. In brief, an immunogen, preferably a purified protein, is mixed with an
adjuvant and animals are immunized. When appropriately high titers of antibody to the
immunogen are obtained, blood is collected from the animal and antiserum is prepared.
Further fractionation of the antiserum to enrich for antibodies reactive to Chlamydia
proteins can be done if desired (see Harlow & Lane, Antibodies: A Laboratory Manual
(1988)).

Polyclonal antisera are used to identify and characterize Chlamydia in the
tissues of patients using, for instance, in situ techniques and immunoperoxidase test
procedures described in Anderson et al., JAVMA 198:241 (1991) and Barr et al., Vet.
Pathol. 28:110-116 (1991).

Monoclonal antibodies may be obtained by various techniques familiar to
those skilled in the art. Briefly, spleen cells from an animal immunized with a desired
antigen are immortalized, commonly by fusion with a myeloma cell (see Kohler &
Milstein, Eur. J. Immunol. 6:511-519 (1976)). Alternative methods of immortalization
include transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other
methods well known in the art. Colonies arising from single immortalized cells are
screened for production of antibodies of the desired specificity and affinity for the
antigen, and yield of the monoclonal antibodies produced by such cells may be enhanced
by various techniques, including injection into the peritoneal cavity of a vertebrate host.

Monoclonal antibodies produced in such a manner are used, for instance,
in ELISA diagnostic tests, immunoperoxidase tests, immunohistochemical tests, for the in
vitro evaluation of spirochete invasion, to select candidate antigens for vaccine
development, protein isolation, and for screening genomic and cDNA libraries to select
appropriate gene sequences.

Immunodiagnostic detection of C. pneumoniae infections

The present invention also provides methods for detecting the presence or
absence of C. pneumoniae, or antibodies reactive with it, in a biological sample. For
instance, antibodies specifically reactive with Chlamydia can be detected using either
Chlamydia proteins or the isolates described here. The proteins and isolates can also be
used to raise specific antibodies (either monoclonal or polyclonal) to detect the antigen in
a sample. In addition, the nucleic acids disclosed and claimed here can be used to detect
Chlamydia-specific sequences using standard hybridization techniques.

For a review of immunological and immunoassay procedures in general,
see Basic and Clinical Immunology (Stites & Terr eds., 7th ed. 1991). The immunoassays
of the present invention can be performed in any of several configurations, which are
reviewed extensively in Enzyme Immunoassay (Maggio, ed., 1980) and Tijssen, Laboratory
Techniques in Biochemistry and Molecular Biology (1985). For instance, the proteins
and antibodies disclosed here are conveniently used in ELISA, immunoblot analysis and
agglutination assays.

In brief, immunoassays to measure anti-Chlamydia antibodies or antigens
can be either competitive or noncompetitive binding assays. In competitive binding
assays, the sample analyte (e.g., anti-Chlamydia antibodies) competes with a labeled
analyte (e.g., anti-Chlamydia monoclonal antibody) for specific binding sites on a capture
agent (e.g., isolated Chlamydia protein) bound to a solid surface. The concentration of
labeled analyte bound to the capture agent is inversely proportional to the amount of free
analyte present in the sample.
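
The inverse relationship in a competitive format can be illustrated with a deliberately
simplified model. The sketch below assumes that labeled and sample analyte bind the
capture agent with equal affinity and that the capture sites are limiting and saturated,
so the labeled share of bound analyte equals its share of total analyte. The function
name and the concentrations are hypothetical and are not taken from this disclosure.

```python
def labeled_bound_fraction(labeled_conc: float, sample_conc: float) -> float:
    """Fraction of occupied capture sites that hold labeled analyte.

    Simplifying assumptions (illustrative only): labeled and sample analyte
    have equal affinity, and capture sites are limiting and saturated.
    """
    if labeled_conc <= 0:
        raise ValueError("labeled_conc must be positive")
    if sample_conc < 0:
        raise ValueError("sample_conc must be non-negative")
    return labeled_conc / (labeled_conc + sample_conc)


if __name__ == "__main__":
    # Hypothetical concentrations in arbitrary units.
    for sample in (0.0, 1.0, 3.0, 9.0):
        fraction = labeled_bound_fraction(1.0, sample)
        print(f"sample analyte = {sample:4.1f} -> labeled fraction bound = {fraction:.2f}")
```

Under these assumptions the labeled signal falls as the sample analyte rises, which is
the behavior a competitive assay relies on; a real assay would be read against a
standard curve rather than this idealized ratio.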

Noncompetitive assays are typically sandwich assays, in which the sample
analyte is bound between two analyte-specific binding reagents. One of the binding
agents is used as a capture agent and is bound to a solid surface. The second binding
agent is labelled and is used to measure or detect the resultant complex by visual or
instrument means.

A number of combinations of capture agent and labelled binding agent can
be used. For instance, an isolated Chlamydia protein or culture can be used as the
capture agent and labelled anti-human antibodies specific for the constant region of
human antibodies can be used as the labelled binding agent. Goat, sheep and other
non-human antibodies specific for human immunoglobulin constant regions (e.g., γ or μ) are
well known in the art. Alternatively, the anti-human antibodies can be the capture agent
and the antigen can be labelled.

Various components of the assay, including the antigen, anti-Chlamydia
antibody, or anti-human antibody, may be bound to a solid surface. Many methods for
immobilizing biomolecules to a variety of solid surfaces are known in the art. For
instance, the solid surface may be a membrane (e.g., nitrocellulose), a microtiter dish
(e.g., PVC or polystyrene) or a bead. The desired component may be covalently bound or
noncovalently attached through nonspecific bonding.

Alternatively, the immunoassay may be carried out in liquid phase and a
variety of separation methods may be employed to separate the bound labelled component
from the unbound labelled components. These methods are known to those of skill in the
art and include immunoprecipitation, column chromatography, adsorption, addition of
magnetizable particles coated with a binding agent and other similar procedures.

An immunoassay may also be carried out in liquid phase without a
separation procedure. Various homogeneous immunoassay methods are now being
applied to immunoassays for protein analytes. In these methods, the binding of the
binding agent to the analyte causes a change in the signal emitted by the label, so that
binding may be measured without separating the bound from the unbound labelled
component.

Western blot (immunoblot) analysis can also be used to detect the presence
of antibodies to Chlamydia in the sample. This technique is a reliable method for
confirming the presence of antibodies against a particular protein in the sample. The
technique generally comprises separating proteins by gel electrophoresis on the basis of
molecular weight, transferring the separated proteins to a suitable solid support (such as a
nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample
with the separated proteins. This causes specific target antibodies present in the sample
to bind their respective proteins. Target antibodies are then detected using labeled
anti-human antibodies.

The immunoassay formats described above employ labelled assay
components. The label may be coupled directly or indirectly to the desired component of
the assay according to methods well known in the art. A wide variety of labels may be
used. The component may be labelled by any one of several methods. Traditionally a
radioactive label incorporating ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P was used. Non-radioactive labels
include ligands which bind to labelled antibodies, fluorophores, chemiluminescent agents,
enzymes, and antibodies which can serve as specific binding pair members for a labelled
ligand. The choice of label depends on sensitivity required, ease of conjugation with the
compound, stability requirements, and available instrumentation.

Enzymes of interest as labels will primarily be hydrolases, particularly
phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases.
Fluorescent compounds include fluorescein and its derivatives, rhodamine and its
derivatives, dansyl, umbelliferone, etc. Chemiluminescent compounds include luciferin,
and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labelling or
signal producing systems which may be used, see U.S. Patent No. 4,391,904, which is
incorporated herein by reference.

Non-radioactive labels are often attached by indirect means. Generally, a
ligand molecule (e.g., biotin) is covalently bound to the molecule. The ligand then binds
to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or
covalently bound to a signal system, such as a detectable enzyme, a fluorescent
compound, or a chemiluminescent compound. A number of ligands and anti-ligands can
be used. Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and
cortisol, it can be used in conjunction with the labelled, naturally occurring anti-ligands.
Alternatively, any haptenic or antigenic compound can be used in combination with an
antibody.

Some assay formats do not require the use of labelled components. For
instance, agglutination assays can be used to detect the presence of the target antibodies.
In this case, antigen-coated particles are agglutinated by samples comprising the target
antibodies. In this format, none of the components need be labelled and the presence of
the target antibody is detected by simple visual inspection.

Pharmaceutical Compositions 

The peptides or antibodies (typically monoclonal antibodies) of the present
invention and pharmaceutical compositions thereof are useful for administration to
mammals, particularly humans, to treat and/or prevent Chlamydia infections. Suitable
formulations are found in Remington's Pharmaceutical Sciences, Mack Publishing
Company, Philadelphia, PA, 17th ed. (1985).

The immunogenic peptides or antibodies of the invention are administered
prophylactically or to an individual already suffering from the disease. The peptide
compositions are administered to a patient in an amount sufficient to elicit an effective
immune response to Chlamydia. An effective immune response is one that inhibits
infection. An amount adequate to accomplish this is defined as a "therapeutically effective
dose" or "immunogenically effective dose." Amounts effective for this use will depend
on, e.g., the peptide composition, the manner of administration, the stage and severity of
the disease being treated, the weight and general state of health of the patient, and the
judgment of the prescribing physician, but generally range for the initial immunization
(that is, for therapeutic or prophylactic administration) from about 0.1 mg to about 1.0 mg
per 70 kilogram patient, more commonly from about 0.5 mg to about 0.75 mg per 70 kg
of body weight. Boosting dosages are typically from about 0.1 mg to about 0.5 mg of
peptide using a boosting regimen over weeks to months depending upon the patient's
response and condition. A suitable protocol would include injection at time 0, 4, 2, 6, 10
and 14 weeks, followed by further booster injections at 24 and 28 weeks.

For therapeutic use, administration should begin at the first sign of
infection. This is followed by boosting doses at least until symptoms are substantially
abated and for a period thereafter. In some circumstances, loading doses followed by
boosting doses may be required. The resulting immune response helps to cure or at least
partially arrest symptoms and/or complications. Vaccine compositions containing the
peptides are administered prophylactically to a patient susceptible to or otherwise at risk
of the infection.

The pharmaceutical compositions (containing either peptides or
antibodies) are intended for parenteral or oral administration. Preferably, the
pharmaceutical compositions are administered parenterally, e.g., subcutaneously,
intradermally, or intramuscularly. Thus, the invention provides compositions for
parenteral administration which comprise a solution of the immunogenic polypeptides
dissolved or suspended in an acceptable carrier, preferably an aqueous carrier. A variety
of aqueous carriers may be used, e.g., water, buffered water, 0.4% saline, 0.3% glycine,
hyaluronic acid and the like. These compositions may be sterilized by conventional, well
known sterilization techniques, or may be sterile filtered. The resulting aqueous solutions
may be packaged for use as is, or lyophilized, the lyophilized preparation being combined
with a sterile solution prior to administration. The compositions may contain
pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions, such as buffering agents, tonicity adjusting agents, wetting
agents and the like, for example, sodium acetate, sodium lactate, sodium chloride,
potassium chloride, calcium chloride, sorbitan monolaurate, triethanolamine oleate, etc.

The compositions may also comprise carriers to enhance the immune
response. Useful carriers are well known in the art, and include, e.g., KLH,
thyroglobulin, albumins such as human serum albumin, tetanus toxoid, polyamino acids
such as poly(lysine:glutamic acid), influenza, hepatitis B virus core protein, hepatitis B
virus recombinant vaccine and the like.

For solid compositions, conventional nontoxic solid carriers may be used
which include, for example, pharmaceutical grades of mannitol, lactose, starch,
magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium
carbonate, and the like. For oral administration, a pharmaceutically acceptable nontoxic
composition is formed by incorporating any of the normally employed excipients, such as
those carriers previously listed, and generally 10-95% of active ingredient, that is, one or
more peptides of the invention, and more preferably at a concentration of 25%-75%.

As noted above, the peptide compositions are intended to induce an
immune response to Chlamydia. Thus, compositions and methods of administration
suitable for maximizing the immune response are preferred. For instance, peptides may
be introduced into a host, including humans, linked to a carrier or as a homopolymer or
heteropolymer of active peptide units from various Chlamydia proteins disclosed here.
Alternatively, a "cocktail" of polypeptides can be used. A mixture of more than one
polypeptide has the advantage of increased immunological reaction and, where different
peptides are used to make up the polymer, the additional ability to induce antibodies to a
number of epitopes.

The compositions also include an adjuvant. A number of
adjuvants are well known to one skilled in the art. Suitable adjuvants include incomplete
Freund's adjuvant, alum, aluminum phosphate, aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, referred to as MTP-PE),
and RIBI, which contains three components extracted from bacteria, monophosphoryl
lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2%
squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by
measuring the amount of antibodies directed against the immunogenic peptide.

The concentration of immunogenic peptides of the invention in the
pharmaceutical formulations can vary widely, i.e., from less than about 0.1%, usually at
or at least about 2%, to as much as 20% to 50% or more by weight, and will be selected
primarily by fluid volumes, viscosities, etc., in accordance with the particular mode of
administration selected.

The peptides of the invention can also be expressed by attenuated viral
hosts, such as vaccinia or fowlpox. This approach involves the use of vaccinia virus as a
vector to express nucleotide sequences that encode the peptides of the invention. Upon
introduction into a host, the recombinant vaccinia virus expresses the immunogenic
peptide, and thereby elicits an immune response. Vaccinia vectors and methods useful in
immunization protocols are described in, e.g., U.S. Patent No. 4,722,848. Another vector
is BCG (Bacille Calmette Guerin). BCG vectors are described in Stover et al. (Nature
351:456-460 (1991)). A wide variety of other vectors useful for therapeutic
administration or immunization of the peptides of the invention, e.g., Salmonella typhi
vectors and the like, will be apparent to those skilled in the art from the description
herein.

The DNA encoding one or more of the peptides of the invention can also
be administered to the patient. This approach is described, for instance, in Wolff et al.,
Science 247:1465-1468 (1990) as well as U.S. Patent Nos. 5,580,859 and 5,589,466.

In order to enhance serum half-life, the peptides may also be encapsulated,
introduced into the lumen of liposomes, prepared as a colloid, or other conventional
techniques may be employed which provide an extended serum half-life of the peptides.
A variety of methods are available for preparing liposomes, as described in, e.g., Szoka et
al., Ann. Rev. Biophys. Bioeng. 9:467 (1980), U.S. Pat. Nos. 4,235,871, 4,501,728 and
4,837,028.

The following examples are offered to illustrate, but not to limit, the
claimed invention.

Example 1:

This example describes comparison of the C. pneumoniae genome
disclosed here and the previously sequenced C. trachomatis genome (Stephens et al.,
Science 282:754-759 (1998)).

The apparent low level of DNA homology between C. trachomatis and C.
pneumoniae (Campbell et al., J. Clin. Microbiol. 25:1911-1916 (1987)), yet analogous
cell structures and developmental cycles, predicts that comparative analysis of the two
genomes will significantly enhance the understanding of both pathogens. Identification
of genes that are present in one species but not the other is of particular importance for
the mutually exclusive biological, virulence and pathogenesis capabilities of each.
Identification of genes shared between the two species strongly supports the requirement
for these capabilities in a biological system that has, over its long-term association with
mammalian host cells, evolved to reduce the metabolic capacities while optimizing
survival, growth and transmission of these unique pathogens.

The previously sequenced C. trachomatis genome contains 1,042,519
nucleotides and 875 likely protein-coding genes. Similarity searching permitted the
inferred functional assignment of 636 (60%) of the genes disclosed here, and 251
(23%) are similar to hypothetical genes for other bacterial organisms, including those for
C. trachomatis. The remaining 186 (17%) genes are not homologous to sequences
deposited in GenBank. Seventy C. trachomatis genes are not represented in the C.
pneumoniae genome. These are contained within blocks consisting of 2-17 genes and 19
single genes. Of the 70 C. trachomatis genes without homologs in C. pneumoniae, 60 are
classified as encoding hypothetical proteins. The remaining genes not represented in C.
pneumoniae consist of the tryptophan operon (trpA, B, R), trpC, two predicted thiol
protease genes, and 4 genes assigned to the phospholipase-D superfamily.
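
The category counts quoted above can be cross-checked with simple arithmetic. The
sketch below is illustrative only: it assumes the three categories (636, 251 and 186
genes) are exhaustive, so that their sum is the total number of predicted genes, and
the dictionary labels are informal descriptions rather than terms used in this
disclosure. The first share works out to roughly 59%, which the text rounds to 60%.

```python
# Gene counts as stated in the paragraph above.
categories = {
    "functional assignment inferred": 636,
    "similar to hypothetical genes": 251,
    "no homolog in GenBank": 186,
}

# Assumption: the three categories are exhaustive, so their sum is the total.
total_genes = sum(categories.values())

# Share of each category as a percentage of that total.
percentages = {
    name: 100.0 * count / total_genes for name, count in categories.items()
}

if __name__ == "__main__":
    print(f"total genes (sum of categories): {total_genes}")
    for name, share in percentages.items():
        print(f"{name}: {share:.1f}%")
```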

It is evident that there is a high level of functional conservation between C.
pneumoniae and C. trachomatis, as orthologs to C. trachomatis genes were identified for
859 (80%) of the predicted coding sequences for C. pneumoniae. The level of similarity
for individual encoded proteins spans a wide spectrum (22-95% amino acid identity) with
an average of 62% amino acid identity between orthologs from the two species. The
percent amino acid identity between orthologous chlamydial proteins is similar among
functional groups, with the highest for proteins associated with translation and the lowest
for proteins whose function in chlamydiae is uncharacterized and not related to proteins
encoded by other organisms. The gene order of the homologous set of genes in C.
pneumoniae shows reorganization relative to the genome of C. trachomatis; however,
there is a high level of synteny for the gene organization of the two genomes. We
identified thirty-nine blocks of 2 or more genes whose gene organization is colinear with
homologs to C. trachomatis, although some of these are inverted. Genome
reorganization is not evenly distributed on the chromosome, as the region
between C. pneumoniae coding sequences 0130-0300 contains substantially more
reorganization than other areas of the genome. This region coincides with the predicted
chromosome replication terminus.
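
For readers unfamiliar with the percent amino acid identity figures quoted above, the
following sketch shows one common way such a value can be computed from a pairwise
alignment. It is not the method used to obtain the values in this example, which is not
specified here. It assumes the two sequences are already aligned to equal length with
"-" as the gap character and that gapped columns are excluded from the denominator;
other conventions exist. The function name and the short sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(f"{percent_identity('MKT-LLA', 'MRTALLA'):.1f}% identity")
```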

We identified orthologs of enzymes characterized in other bacteria that
account for the essential requirements for DNA replication, repair, transcription and
translation, including two predicted DNA helicases of the Swi2/Snf2 family found in C.
trachomatis. Similar to C. trachomatis, alternative sigma subunits for RNA polymerase,
σ28 and σ54, were identified in addition to anti-σ regulatory system factors RsbV, a
RsbW-like single-domain histidine kinase, and a RsbU-like protein phosphatase. These
findings suggest that the fundamental mechanisms of transcriptional regulation are
conserved among Chlamydia. The C. trachomatis proteins containing SET and SWIB
domains, and a SWIB domain fused to the C-terminus of the chlamydial topoisomerase I,
not identified outside eukaryotes, are found in C. pneumoniae, supporting their possible
role in the chromatin condensation-decondensation characteristic of the biologically
unique chlamydial developmental cycle.

The central metabolic pathways inferred from the C. pneumoniae genome
sequence are the same as those identified for C. trachomatis. C. pneumoniae has a
glycolytic pathway and a linked tricarboxylic acid cycle that, although likely functional, is
incomplete, as genes for citrate synthase, aconitase, and isocitrate dehydrogenase were not
identified. C. pneumoniae has a complete glycogen synthesis and degradation system,
supporting a role for glycogen synthesis and utilization of glucose derivatives in
chlamydial metabolism. Genes encoding essential functions in aerobic respiration are
present and electron flux may be supported by pyruvate, succinate, glycerol-3-phosphate,
and NADH dehydrogenases, NADH-ubiquinone oxidoreductase and cytochrome oxidase.
C. pneumoniae also contains the V (vacuolar)-type ATPase operon and the two ATP
translocases found in C. trachomatis.

The type-III secretion virulence system required for invasion by several
pathogenic bacteria and found in the C. trachomatis genome in three chromosomal
locations is also present in the C. pneumoniae genome. Each of the components is
conserved and their relative genomic contexts are conserved. Genes such as a predicted
serine/threonine protein kinase and other genes physically linked to genes encoding
structural components of the type-III secretion apparatus, but without identified
homologs, are also highly similar between the two species, suggesting the functional roles
in modifying cellular biology are fundamentally conserved.

Chlamydia-encoded proteins that are not found in chlamydial organisms
but localized to the intracellular chlamydial inclusion membrane are likely essential for
the unique intracellular biology and perhaps differences in inclusion morphology
observed between species of Chlamydia. Several such proteins, termed IncA, B and C, have
been characterized for a C. psittaci strain (Rockey et al., Mol. Microbiol. 15:617-626
(1995); Rockey et al., Infect. Immun. 62:106-112 (1994)). C. pneumoniae and C.
trachomatis encode orthologs to C. psittaci IncB and IncC, and C. trachomatis also
contains an ortholog to IncA. C. pneumoniae contains two genes that encode proteins
with similarity to IncA (CPn0186 and CPn0585), although the level of homology is low,
suggesting analogous but possibly altered functions.

The tryptophan biosynthesis operon (trpA, trpB, trpR) and trpC identified
in C. trachomatis are conspicuously missing from the C. pneumoniae genome. These
represent the entire repertoire of genes associated with tryptophan biosynthesis identified
in C. trachomatis. Seventeen genes adjacent to the C. trachomatis tryptophan operon also
were not found in the C. pneumoniae genome. This region is the single largest loss of a
contiguous genomic segment and includes 4 HKD superfamily encoding genes that
encompass a family of proteins related to endonuclease and phospholipase D. These
findings may be important for the ability of Chlamydia to persist in their hosts and cause
disease by eliciting potent, focal and persistent inflammatory responses thought to be
essential for pathogenesis.

The C. pneumoniae genome contains 187,711 additional nucleotides
compared to the C. trachomatis genome, and the 214 coding sequences not found in C.
trachomatis account for most of the increased genome size. Eighty-eight of these genes
are found in blocks of >10 genes (11-30 genes/block), 41 are single genes, and the
remainder are partnered with at least one other gene. Based upon the observation that
~70% of all the C. pneumoniae genes have an identifiable homolog in GenBank,
exclusive of C. trachomatis, it would be expected that over 150 of the 214 genes should
have a homolog in GenBank, many associated with a function. However, only 28 coding
sequences have similarity to genes from other organisms. Thus the majority of the genes
that are mutually exclusive of C. trachomatis (186 of 214), and the 60 of 70 C.
trachomatis genes that lacked an identifiable homolog in C. pneumoniae, do not have
detectable homologs to genes from other organisms. We predict that most of the unique
genes are essential for specific attributes that define the differential biology, tropism and
pathogenesis of C. trachomatis and C. pneumoniae. Moreover, this suggests that C.
pneumoniae has more unique biological (i.e., virulence) capacity than C. trachomatis.
The ability of C. pneumoniae to be more invasive and survive in a broader range of host
cell types than C. trachomatis is consistent with this hypothesis. Not all of the
differences in biological capacity may be associated with mutually exclusive genes. One
explanation for the significantly lower level of homology between protein sequences
assigned as having C. pneumoniae and C. trachomatis orthologs but no identifiable
orthologs in other organisms is that this set of proteins is not only associated with
biological requirements specific for Chlamydia but this polymorphism may account for
differential biology between the two species. The determination of the genome sequence
from a representative of the C. psittaci group will precisely delineate those genes that are
mutually exclusive and specific for each species.
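
The figures in this comparison can be tied together with a short calculation. The
sketch below uses only numbers stated in this example; the variable names are
hypothetical, and it assumes that the 859 coding sequences with C. trachomatis
orthologs and the 214 without are disjoint and together make up all predicted coding
sequences. The ~70% expectation works out to roughly 150 genes.

```python
# Values stated in this example.
C_TRACHOMATIS_NT = 1_042_519   # C. trachomatis genome length
ADDITIONAL_NT = 187_711        # additional nucleotides in C. pneumoniae
ORTHOLOG_CDS = 859             # coding sequences with C. trachomatis orthologs
UNIQUE_CDS = 214               # coding sequences not found in C. trachomatis
IN_LARGE_BLOCKS = 88           # unique genes in blocks of >10 genes
SINGLE_GENES = 41              # unique genes occurring singly
WITH_OTHER_HOMOLOG = 28        # unique genes similar to genes of other organisms

# Derived quantities.
c_pneumoniae_nt = C_TRACHOMATIS_NT + ADDITIONAL_NT
partnered = UNIQUE_CDS - IN_LARGE_BLOCKS - SINGLE_GENES
without_homolog = UNIQUE_CDS - WITH_OTHER_HOMOLOG
expected_with_homolog = 0.70 * UNIQUE_CDS  # expectation at ~70%

# Assumption: ortholog and unique sets are disjoint and exhaustive.
total_cds = ORTHOLOG_CDS + UNIQUE_CDS
ortholog_share = 100.0 * ORTHOLOG_CDS / total_cds

if __name__ == "__main__":
    print(f"C. pneumoniae genome length: {c_pneumoniae_nt:,} nt")
    print(f"unique genes partnered with at least one other gene: {partnered}")
    print(f"unique genes without a detectable homolog: {without_homolog}")
    print(f"expected unique genes with a homolog at 70%: {expected_with_homolog:.1f}")
    print(f"share of coding sequences with orthologs: {ortholog_share:.1f}%")
```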

The major functionally identifiable addition to the C. pneumoniae genome
is a large expansion of genes encoding a new family of chlamydial polymorphic
membrane proteins (Pmp), alone representing 22% of the increased coding capacity.
While the C. trachomatis genome has 9 pmp genes, remarkably the C. pneumoniae
genome contains 21 pmp genes. Most of these genes appear to be amplified in two
regions of the genome, with three stand-alone genes. Interestingly, one of the stand-alone
genes is most closely related to the C. trachomatis pmpD, which is the only stand-alone
pmp gene in the C. trachomatis genome, and it is located with the same relative genomic
context, suggesting an essential and conserved function for this paralog. Six Pmp-coding
genes are presumably not functional, as five contain predicted coding frame-shifts and one
is truncated. The amplification of this gene family and the confidently predicted
frame-shifts suggest a specific molecular mechanism to promote functional or antigenic
diversity. The biological role of this protein family remains enigmatic, although at least
one of the proteins in C. psittaci related to this family is exposed on the chlamydial
surface.



While a function could not be assigned for most of the unique C.
pneumoniae genes, several have significant similarity to genes from other organisms.
Functional assignments could be made for genes encoding GMP synthetase, IMP
dehydrogenase, UMP synthase, uridine kinase, biotin synthase pathway proteins,
methylthioadenosine nucleosidase, a DNA glycosylase and aromatic amino acid
hydroxylase. Thus a complete pathway was identified for biotin biosynthesis. The
additional purine and pyrimidine salvage pathway genes presumably reflect metabolic
limitations in one of the cell types that C. pneumoniae infects or differences in the ability
of C. pneumoniae to transport precursor nucleosides or nucleotides.

The addition of aromatic amino acid hydroxylase in C. pneumoniae is
intriguing, especially in light of the loss of tryptophan biosynthetic genes and the inability
to synthesize other amino acids including phenylalanine. Aromatic amino acid
hydroxylases include three distinct enzymes that respectively oxidize
phenylalanine to tyrosine, tyrosine to Dopa, and tryptophan to 5-hydroxytryptophan and
serotonin. Although the chlamydial protein is similar to proteins of this family and
incrementally more closely related to tryptophan hydroxylase, its specific function could
not be confidently predicted. We hypothesize that it may be involved in C. pneumoniae
virulence. Tryptophan hydroxylase has not been previously identified in bacteria and the
origin of the chlamydial gene appears to be from eukaryotes. The functional role of an
aromatic amino acid hydroxylase for C. pneumoniae is linked to the unique intracellular
biology of this organism and may represent a key contribution to C. pneumoniae
persistence and pathogenesis.

It is understood that the examples and embodiments described herein are
for illustrative purposes only and that various modifications or changes in light thereof
will be suggested to persons skilled in the art and are to be included within the spirit and
purview of this application and scope of the appended claims. All publications, patents,
and patent applications cited herein are hereby incorporated by reference in their entirety
for all purposes.

Table 1 provides functional assignments of C. pneumoniae nonprotein-encoding
genomic sequences. Table 2 provides functional assignments of protein coding
sequences. Table 3 provides the amino acid sequences of the proteins corresponding to
the coding sequences.

CPn0063    78112          78167     F
CPn0064    78346          73576     F
CPn0065    [illegible]    80651     F
CPn0066    [illegible]    82655     F



Function (C. trachomatis orthologs in parentheses)



CT001 hypothetical protein
gatC-Glu-tRNA Gln Amidotransferase (C subunit)-(CT002)
gatA-Glu-tRNA Gln Amidotransferase-(CT003)
gatB (Pet112)-Glu-tRNA Gln Amidotransferase (B subunit)-(CT004)
pmp_1-Polymorphic Outer Membrane Protein G Family
frame-shift with 0010
pmp_2-Polymorphic Outer Membrane Protein G Family
pmp_3-Polymorphic Outer Membrane Protein G Family
pmp_3-PMP_3 (frame-shift with 0014)
pmp_4-Polymorphic Outer Membrane Protein G Family
pmp_4-PMP_4 (frame-shift with 0016)
pmp_5-Polymorphic Outer Membrane Protein G Family
pmp_5-PMP_5 (frame-shift with 0018)
Predicted OMP [leader (14) peptide: outer membrane]-(CT351)
Predicted OMP [leader (19) peptide]-(CT350)
maf-(CT349)
yjjK/alr-ABC Transporter Protein ATPase-(CT348)
xerC-Integrase/recombinase-(CT347)
elaC/atsA-Sulphohydrolase/Glycosulfatase-(CT346)
CT345 hypothetical protein-(CT345)
lon-Lon ATP-dependent Protease-(CT344)
gcp_1-O-Sialoglycoprotein Endopeptidase_1-(CT343)
rs21-S21 Ribosomal Protein-(CT342)
dnaJ-Heat Shock Protein J-(CT341)
pdhA&B/odbA&odbB-(pyruvate) Oxoisovalerate Dehydrogenase Alpha & Beta Fusion-(CT340)
CT339 hypothetical protein
CT338 hypothetical protein
ptsH-PTS Phosphocarrier Protein Hpr-(CT337)
ptsI-PTS PEP Phosphotransferase-(CT336)
ybaB-(CT335)
dnaX_1-DNA Pol III Gamma and Tau_1-(CT334)
yqfF-Bs conserved hypothetical IM protein
hemC-Porphobilinogen Deaminase-(CT299)
sms-Sms Protein-(CT298)
rnc-Ribonuclease III-(CT297)
CT296 hypothetical protein
mrsA-Phosphomannomutase-(CT295)
sodM-Superoxide Dismutase (Mn)-(CT294)
accD-AcCoA Carboxylase/Transferase Beta-(CT293)
dut-dUTP Nucleotidohydrolase-(CT292)
ptsN_1-PTS IIA Protein-(CT291)
ptsN_2-PTS IIA Protein + HTH DNA-Binding Domain-(CT290)
CT289 hypothetical protein
CT288 hypothetical protein 
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CPn0067 


82953 


84053 


F 


CPn0068 


84903 


84331 


R 


CPn0O69 


85236 


87086 


F 


CPn0070 


87378 


87208 


R 


CPn0071 


89045 


87599 


R 


CPn0072 


89061 


88057 


R 


CPn0073 ■ 


89356 


89574 


F 


CPn0074 


89774 


90955 


F 


CPn0075 


91102 


91350 


F 


CPn0076 


91358 


91903 


F 


CPn0077 


92013 


92435 


F 


CPn0078 


92465 


93160 


F 


CPn0079 


93179 


93688 


F 


CPn0080 


93735 


94121 


F 


CPn0081 


94261 


98016 


F 


CPn0082 


98043 


102221 


F 


CPn0083 


102332 


103312 


F 


CPn0084 


103362 


103751 


F 


CPn0085 


104506 


103766 


R 


CPn0086 


104904 


105527 


F 


CPn0087 


105579 


106376 


F 


CPn0088 


106373 


108145 


F 


C^gp089 


108153 


109466 


F 


CP/5jp09O 


109454 


110080 


F 


CPpb091 


110074 


112053 


F 


CPht)092 


112151 


112573 


F 


cMibo93 


112509 


113015 


F 


Cfip094 


113152 


115971 




C$£o095 


116037 


118790 


F 


cfi.0096 


124314 


118837 


R 


CP&0097 


124555 


126006 


F 


Cfllb098 


127491 


126091 


R 


CPn0099 


127593 


127865 


F 


CfnOlOO 


129141 


127882 


R 


CPnOlOl 


129932 


129141 


R 


CPn0102 


130123 


131466 


F 


cih*oio3 


131480 


132511 


F 


cfapi04 


133875 


132676 


R 




134847 


134029 


R 


C#gbl06 


135091 


136374 


F 


CPM)107 


137162 


136392 


R 


CPn0108 


137857 


137303 


R 


CPn0109 


138655 


141783 


F 


CPnOllO 


143734 


141827 


R 


CPnOlll 


144686 


143934 


R 


CPn0112 


144767 


145093 


F 


CPn0113 


145335 


146405 


F 


CPn0114 


146398 


147261 


F 


CPnOllS 


147279 


148622 


F 


CPnOH*6 


148616 


148972 


F 


CPn0117 


148989 


150071 


F 


CPn0118 


150102 


150464 


• F 


CPn0119 


150523 


151164 


F 


CPn0120 


151164 


151778 


F 


CPn0121 


151778 


152068 


F 


CPn0122 


152071 


153723 


F 


CPn0123 


155969 


153774 


R 


CPn0124 


156614 


158068 


F 


CPn0125 


158096 


158605 


F 


CPn0126 


158809 


161085 


F 


CPn0127 


162143 


161130 


R 


CPn0128 


162277 


163053 


F 


CPn0129 


163717 


163064 


R 


CPn0130 


164245 


163751 


R 


CPn0131 


164549 


165580 


F 


CPn0132 


165587 


166561 


F 


CPn0133 


167334 


166564 


R 


CPn0134 


169098 


167467 


R 


CPn0135 


169448 


169143 


R 


CPn0136 


171401 


169569 


R 


CPn0137 


172254 


171502 


R 


CPn0138 


174019 


172700 


R 



CT360 hypothetical protein 



CT325 hypothetical protein 

CT324 hypothetical protein 

infA-Initiation Factor IF-1-(CT323) 

tufA-Elongation Factor Tu-(CT322) 

secE-preprotein translocase-(CT321) 

nusG-Transcriptional Antitermination-(CT320) 

rl11-L11 Ribosomal Protein-(CT319) 

rl1-L1 Ribosomal Protein-(CT318) 

rl10-L10 Ribosomal Protein-(CT317) 

rl7-L7/L12 Ribosomal Protein-(CT316) 

rpoB-RNA Polymerase Beta-(CT315) 

rpoC-RNA Polymerase Beta'-(CT314) 

tal-Transaldolase-(CT313) 

predicted ferredoxin-(CT312) 

CT311 hypothetical protein 

atpE-ATP Synthase Subunit E-(CT310) 

CT309 hypothetical protein 

atpA-ATP Synthase Subunit A-(CT308) 

atpB-ATP Synthase Subunit B-(CT307) 

atpD-ATP Synthase Subunit D-(CT306) 

atpI-ATP Synthase Subunit I-(CT305) 

atpK-ATP Synthase Subunit K-(CT304) 

CT303 hypothetical protein 

valS-Valyl tRNA Synthetase-(CT302) 

pknD-S/T Protein Kinase-(CT301) 

uvrA-Excinuclease ABC Subunit A-(CT333) 

pyk-Pyruvate Kinase-(CT332) 

htrB-Acyltransferase-(CT010) 

CT011 hypothetical protein 

ybbP family hypothetical protein-(CT012) 

cydA-Cytochrome Oxidase Subunit I-(CT013) 

cydB-Cytochrome Oxidase Subunit II-(CT014) 

CT017 hypothetical protein 

CT016 hypothetical protein 

phoH-ATPase-(CT015) 

CT058 hypothetical protein_1 

CT018 

ileS-Isoleucyl-tRNA Synthetase-(CT019) 
lepB-Signal Peptidase I-(CT020) 
CT021 hypothetical protein 
rl31-L31 Ribosomal Protein-(CT022) 

pfrA-Peptide Chain Releasing Factor (RF-1)-(CT023) 

hemK-A/G specific methylase-(CT024) 

ffh-Signal Recognition Particle GTPase-(CT025) 

rs16-S16 Ribosomal Protein-(CT026) 

trmD-tRNA (guanine N-1)-Methyltransferase-(CT027) 

rl19-L19 Ribosomal Protein-(CT028) 

rnhB_1-Ribonuclease HII_1-(CT029) 

gmk-GMP Kinase-(CT030) 

CT031 hypothetical protein 

metG-Methionyl-tRNA Synthetase-(CT032) 

recD_1-Exodeoxyribonuclease V (Alpha Subunit)_1-(CT033) 



ytfF-Cationic Amino Acid Transporter-(CT034) 
bpl1-Biotin Protein Ligase-(CT035) 
similarity to CT036 



CHLPS hypothetical protein-(CT109) 
groEL_1-HSP-60_1-(CT110) 
groES-10KDa Chaperonin-(CT111) 
pepF-Oligopeptidase-(CT112) 
ybgI-ACR family-(CT108) 

hemL-Glutamate-1-semialdehyde-2,1-aminomutase-(CT210) 


CPn0139 


174656 


174093 


R 


yqgE- (CT210) 


CPn0140 


175110 


174673 


R 


yqdE-(CT212) 


CPn0141 


175802 


175110 


R 


rpiA-Ribose-5-P Isomerase A-(CT213) 


CPn0142 


176091 


175816 


R 




CPn0143 


177335 


176214 


R 


*yxjG_Bs_1 Hypothetical Protein 


CPn0144 


177953 


180560 


F 


clpB-Clp Protease ATPase-(CT113) 


CPn0145 


180777 


182369 


F 


CT114 hypothetical protein 


CPn0146 


182613 


183095 


F 




CPn0147 


183225 


183671 


F 




CPn0148 


183846 


185702 


F 


pkn1-S/T Protein Kinase-(CT145) 


CPn0149 


185715 


187700 


F 


dnlJ-DNA Ligase-(CT146) 


CPn0150 


187834 


192444 


F 


CT147 hypothetical protein 


CPn0151 


194142 


192625 


R 


mhpA-Monooxygenase-(CT148) 


CPn0152 


195265 


194318 


R 


CT149 hypothetical protein 


CPn0153 


195433 


197892 


F 


leuS-Leucyl tRNA Synthetase-(CT209) 


CPn0154 


197892 


199202 


F 


gseA-KDO Transferase-(CT208) 


CPn0155 


199691 


199488 


R 




CPn0156 


200117 


199770 


R 




CPn0157 


200723 


200298 


R 




CPn0158 


201430 


200894 


R 




CPn0159 


201772 


201467 


R 




CPn0160 


203791 


202127 


R 


pfkA_1-Fructose-6-P Phosphotransferase_1-(CT207) 


CPn0161 


204622 


203798 


R 


predicted acyltransferase family-(CT206) 


CPn0162 


205828 


204803 


R 




CPn0163 


206026 


206394 


F 




CPn0164 


206498 


206998 


F 




CPn0165 


206998 


207582 


F 




CPn0166 


207630 


207962 


F 




CPn0167 


208306 


207977 


R 




CPn0168 


208641 


208417 


R 




CPn0169 


209501 


208710 


R 




CPn0170 


211026 


210025 


R 




CPn0171 


212435 


211149 


R 


*guaA-GMP Synthase 


CPn0172 


213177 


212440 


R 


*guaB/impD-Inosine 5'-monophosphate dehydrogenase (COOH-only) 


CPn0173 


213987 


213715 


R 




CPn0174 


214257 


214724 


F 




CPn0175 


214898 


215275 


F 




CPn0176 


215286 


216518 


F 


CT153 hypothetical protein 


CPn0177 


217459 


216608 


R 




CPn0178 


218052 


217789 


R 




CPn0179 


218403 


218056 


R 




CPn0180 


218851 


218355 


R 




CPn0181 


219175 


218777 


R 




CPn0182 


220695 


219334 


R 


accC-Biotin Carboxylase-(CT124) 


CPn0183 


221195 


220695 


R 


accB-Biotin Carboxyl Carrier Protein-(CT123) 


CPn0184 


221775 


221221 


R 


efp_l-Elongation Factor P_1-(CT122) 


CPn0185 


222451 


221755 


R 


rpe/araD-Ribulose-P Epimerase-(CT121) 


CPn0186 


222899 


224068 


F 


*similarity to Cps IncA_1-(CT119) 


CPn0187 


224248 


225045 


F 


predicted methylase-(CT133) 


CPn0188 


225111 


226400 


F 


CT132 hypothetical protein 


CPn0189 


226400 


229825 


F 


CT131 homolog- (Possible Transmembrane Protein) 


CPn0190 


229919 


231274 


F 




CPn0191 


231991 


231314 


R 


glnQ-ABC Amino Acid Transporter ATPase- (CT130) 


CPn0192 


232634 


231984 


R 


glnP-ABC Amino Acid Transporter Permease-(CT129) 


CPn0193 


233126 


232686 


R 


*argR-Arginine Repressor 


CPn0194 


233210 


234241 


F 


gcp_2-O-Sialoglycoprotein Endopeptidase_2-(CT197) 


CPn0195 


234190 


235785 


F 


oppA_1-Oligopeptide Binding Protein_1 


CPn0196 


235939 


237519 


F 


oppA_2-Oligopeptide Binding Protein_2-(CT198) 


CPn0197 


237578 


238882 


F 


oppA_3-Oligopeptide Binding Protein_3 


CPn0198 


239169 


240746 


F 


oppA_4-Oligopeptide Binding Protein_4 


CPn0199 


241042 


241983 


F 


oppB_1-Oligopeptide Permease_1-(CT199) 


CPn0200 


242017 


242868 


F 


oppC_1-Oligopeptide Permease_1-(CT200) 


CPn0201 


242864 


243715 


F 


oppD-Oligopeptide Transport ATPase-(CT201) 


CPn0202 


243715 


244500 


F 


oppF-Oligopeptide Transport ATPase-(CT202) 


CPn0203 


245008 


245802 


F 




CPn0204 


245817 


246002 


F 




CPn0205 


246133 


246327 


F 




CPn0206 


246409 


247161 


F 


CT203 hypothetical protein 


CPn0207 


247208 


248617 


F 


ybhI/sodiT1-Oxoglutarate/Malate Translocator-(CT204) 


CPn0208 


248953 


250602 


F 


pfkA_2-Fructose-6-P Phosphotransferase_2-(CT205) 


CPn0209 


251036 


251272 


F 


CPn0210 


252384 


251440 


R 




CPn0211 


252756 


252463 


R 




CPn0212 


254066 


252888 


R 




CPn0213 


254342 


254190 


R 




CPn0214 


255657 


254446 


R 




CPn0215 


257015 


255759 


R 




CPn0216 


257608 


257174 


R 




CPn0217 


257896 


258579 


F 


ypdP- (CT140) 


CPn0218 


259058 


258582 


R 




CPn0219 


259357 


260472 


F 


tgt-Queuine tRNA Ribosyl Transferase-(CT193) 


CPn0220 


260696 


261238 


F 




CPn0221 


261657 


262064 


F 




CPn0222 


262504 


262842 


F 


*weak similarity to Bacteriophage CHP1 (Orf4) 


CPn0223 


262956 


263333 


F 




CPn0224 


263435 


263674 


F 




CPn0225 


263873 


264541 


F 




CPn0226 


264566 


264967 


F 




CPn0227 


265416 


265009 


R 


dsbB-Disulfide bond Oxidoreductase-(CT176) 


CPn0228 


266110 


265412 


R 


dsbG-Disulfide Bond Chaperone-(CT177) 


CPn0229 


266328 


267560 


F 


CT178 hypothetical protein 


CPn0230 


268253 


267576 


R 


CT179 hypothetical protein 


CPn0231 


268957 


268253 


R 


tauB-ABC Transport ATPase (Nitrate/Fe)-(CT180) 


CPn0232 


270122 


269232 


R 


*similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine Nucleosidase 


CPn0233 


270424 


270248 


R 




CPn0234 


271240 


270548 


R 


CT181 hypothetical protein 


CPn0235 


271416 


272177 


F 


kdsB-deoxyoctulonosic Acid Synthetase-(CT182) 


CPn0236 


272156 


273766 


F 


pyrG-CTP Synthetase-(CT183) 


CPn0237 


273762 


274214 


F 


yggF Family-(CT184) 


CPn0238 


274303 


275838 


F 


zwf-Glucose-6-P Dehydrogenase-(CT185) 


CPn0239 


275899 


276672 


F 


devB-Glucose-6-P Dehydrogenase (DevB family)-(CT186) 


CPn0240 


277861 


276698 


R 




CPn0241 


279354 


278203 


R 




CPn0242 


279918 


279487 


R 




CPn0243 


280555 


280133 


R 




CPn0244 


280918 


281556 


F 


adk-Adenylate Kinase-(CT128) 


CPn0245 


281645 


282499 


F 


ydhO-Polysaccharide Hydrolase-Invasin Repeat Family-(CT127) 


CPn0246 


282952 


282551 


R 


rs9-S9 Ribosomal Protein-(CT126) 


CPn0247 


283415 


282969 


R 


rl13-L13 Ribosomal Protein-(CT125) 


CPn0248 


284327 


283650 


R 


ycfV/ybbA-ABC Transporter ATPase-(CT152) 


CPn0249 


285841 


284333 


R 


CT151 hypothetical protein 


CPn0250 


286057 


285902 


R 


rl33-L33 Ribosomal Protein-(CT150) 


CPn0251 


286060 


287559 


F 


*conserved hypothetical protein 


CPn0252 


288112 


287575 


R 


CT144 hypothetical protein (frame-shift with 0253?) 


CPn0253 


288456 


287950 


R 


CT144 hypothetical protein_1 


CPn0254 


289262 


288459 


R 


CT143 hypothetical protein_1 


CPn0255 


290165 


289329 


R 


CT142 hypothetical protein_1 


CPn0256 


291264 


290398 


R 


CT144 hypothetical protein_2 


CPn0257 


292127 


291267 


R 


CT143 hypothetical protein_2 


CPn0258 


292534 


292133 


R 


CT142 hypothetical protein (frame-shift with 0259?) 


CPn0259 


292986 


292441 


R 


CT142 hypothetical protein_2 


CPn0260 


294045 


293548 


R 


secA_1-Protein Translocase Subunit_1-(CT141) 


CPn0261 


294302 


295033 


F 


ydaO-PP-Loop Superfamily ATPase- (CT217) 


CPn0262 


295091 


295933 


F 


surE-SurE-like Acid Phosphatase-(CT218) 


CPn0263 


296249 


297136 


F 


yqfU hypothetical protein- (CT221) 


CPn0264 


297730 


297155 


R 


ubiD-Phenylacrylate Decarboxylase-(CT220) 


CPn0265 


298620 


297730 


R 


ubiA-Benzoate Octaphenyltransferase-(CT219) 


CPn0266 


299184 


299876 


F 




CPn0267 


300122 


300910 


F 




CPn0268 


300935 


301318 


F 




CPn0269 


302450 


301476 


R 


Dipeptidase- (CT138) 


CPn0270 


303325 


302468 


R 


ywlC-SuA5 Superfamily-related Protein-(CT137) 


CPn0271 


303634 


304362 


F 


Lysophospholipase esterase- (CT136) 


CPn0272 


305233 


304340 


R 


dnaX_2-DNA Pol III Gamma and Tau_2-(CT187) 


CPn0273 


305844 


305227 


R 


tdk-Thymidylate Kinase-(CT188) 


CPn0274 


308353 


305852 


R 


gyrA_1-DNA Gyrase Subunit A_1-(CT189) 


CPn0275 


310786 


308372 


R 


gyrB_1-DNA Gyrase Subunit B_1-(CT190) 


CPn0276 


311137 


310793 


R 


CT191 hypothetical protein 


CPn0277 


311910 


311404 


R 




CPn0278 


312875 


312060 


R 


*conserved outer membrane lipoprotein protein 


CPn0279 


313537 


312875 


R 


*Possible ABC Transporter Permease Protein 


CPn0280 


314572 


313550 


R 


dppF-Dipeptide Transporter ATPase-(CT689) 


CPn0281 


315057 


316103 


F 


CPn0282 


316125 


317529 


F 


CPn0283 


318497 


317532 


R 


CPn0284 


319045 


318551 


R 


CPn0285 


320595 


319051 


R 


CPn0286 


322059 


320650 


R 


CPn0287 


324221 


322089 


R 


CPn0288 


325716 


324571 


R 


CPn0289 


325812 


326996 


F 


CPn0290 


327042 


328523 


F 


CPn0291 


328667 


329194 


F 


CPn0292 


329228 


329836 


F 


CPn0293 


329949 


332723 


F 


CPn0294 


333092 


333502 


F 


CPn0295 


333863 


333627 


R 


CPn0296 


334765 


334022 


R 


CPn0297 


335697 


334774 


R 


CPn0298 


336721 


335717 


R 


CPn0299 


336816 


337415 


F 


CPn0300 


337783 


340152 


F 


CPn0301 


340250 


340762 


F 


CPn0302 


340787 


341866 


F 


CPn0303 


342958 


341921 


R 


CPn0304 


343133 


344158 


F 


CPn0305 


344154 


345137 


F 


CPn0306 


345145 


346431 


F 


CPn0307 


348986 


346515 


R 


CPn0308 


349234 


349596 


F 


CPn0309 


350974 


349595 


R 


CPn0310 


353433 


351049 


R 


CPn0311 


354438 


353575 


R 


CPn0312 


354524 


354976 


F 


CPn0313 


354990 


355355 


F 


CPn0314 


356285 


355353 


R 


CPn0315 


356977 


358716 


F 


CPn0316 


358820 


360121 


F 


CPn0317 


360081 


362750 


F 


CPn0318 


362767 


363126 


F 


CPn0319 


363175 


363879 


F 


CPn0320 


363860 


364783 


F 


CPn0321 


365858 


364767 


R 


CPn0322 


366249 


367328 


F 


CPn0323 


367331 


369460 


F 


CPn0324 


369492 


370688 


F 


CPn0325 


370708 


371148 


F 


CPn0326 


371148 


372725 


F 


CPn0327 


372945 


373211 


F 


CPn0328 


373241 


374992 


F 


CPn0329 


375088 


376146 


F 


CPn0330 


376675 


376202 


R 


CPn0331 


378437 


376701 


R 


CPn0332 


378655 


378536 


R 


CPn0333 


379090 


378800 


R 


CPn0334 


379311 


379823 


F 


CPn0335 


379817 


380674 


F 


CPn0336 


380650 


381591 


F 


CPn0337 


382027 


381575 


R 


CPn0338 


382278 


383375 


F 


CPn0339 


383420 


384034 


F 


CPn0340 


383842 


384156 


F 


CPn0341 


384160 


384495 


F 


CPn0342 


384622 


385052 


F 


CPn0343 


384999 


385595 


F 


CPn0344 


387420 


385558 


R 


CPn0345 


388572 


387436 


R 


CPn0346 


389675 


388704 


R 


CPn0347 


391021 


389678 


R 


CPn0348 


391803 


391027 


R 


CPn0349 


392770 


391790 


R 


CPn0350 


393181 


393684 


F 


CPn0351 


393888 


395432 


F 



dhnA-Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin family)-(CT215) 

xasA/gadC-Amino Acid Transporter-(CT216) 



mgtE-Mg++ Transporter (CBS Domain)-(CT194) 
CT195 hypothetical protein 

aaaT-Neutral Amino Acid (Glutamate) Transporter-(CT230) 
Na-dependent Transporter-(CT231) 
incB-Inclusion Membrane Protein B-(CT232) 
incC-Inclusion Membrane Protein C-(CT233) 
CT234 hypothetical protein 

cAMP-Dependent Protein Kinase Regulatory Subunit-(CT235) 

acpP-Acyl Carrier Protein-(CT236) 

fabG-Oxoacyl (Carrier Protein) Reductase-(CT237) 

fabD-Malonyl Acyl Carrier Transcyclase-(CT238) 

fabH-Oxoacyl Carrier Protein Synthase III-(CT239) 

recR-Recombination Protein-(CT240) 

yaeT-Omp85 Analog-(CT241) 

(OmpH-Like Outer Membrane Protein)-(CT242) 
lpxD-UDP Glucosamine N-Acyltransferase-(CT243) 
CT244 hypothetical protein 

pdhA/odpA-Pyruvate Dehydrogenase Alpha-(CT245) 
pdhB/odpB-Pyruvate Dehydrogenase Beta-(CT246) 
pdhC-Dihydrolipoamide Acetyltransferase-(CT247) 
glgP-Glycogen Phosphorylase-(CT248) 
similarity to CT249 

dnaA_1-Replication Initiation Protein_1-(CT250) 
60IM-60kDa Inner Membrane Protein-(CT251) 
lgt-Prolipoprotein Diacylglycerol Transferase-(CT252) 
CT101 hypothetical protein 

acpS-Acyl-carrier Protein Synthase-(CT100) 
trxB-Thioredoxin Reductase-(CT099) 
rs1-S1 Ribosomal Protein-(CT098) 
nusA-N Utilization Protein A-(CT097) 
infB-Initiation Factor-2-(CT096) 
rbfA-Ribosome Binding Factor A-(CT095) 
truB-tRNA Pseudouridine Synthase-(CT094) 
ribF-FAD Synthase-(CT093) 
ychF-GTP Binding Protein-(CT092) 
yscU-YopS Translocation Protein U-(CT091) 
lcrD-Low Calcium Response D-(CT090) 
lcrE-Low Calcium Response E-(CT089) 
sycE-Secretion Chaperone-(CT088) 
malQ-Glucanotransferase-(CT087) 
rl28-L28 Ribosomal Protein-(CT086) 
CT085 hypothetical protein 

Phospholipase D Superfamily [leader (33) peptide]-(CT084) 

CT083 hypothetical protein 

CT082 hypothetical protein 

CHLTR T2 Protein-(CT081) 

ltuB-(CT080) 

CT079 similarity 

folD-Methylene Tetrahydrofolate Dehydrogenase-(CT078) 
yojL-(CT077) 

smpB-Small Protein B-(CT076) 
dnaN-DNA Pol III (beta chain)-(CT075) 
recF-ABC superfamily ATPase-(CT074) 

(frame-shift with 0339) 

(frame-shift with 0340) 

predicted OMP [leader (19) peptide]-(CT073) 
(frame-shift with 0342?) 
yaeL-Metalloprotease-(CT072) 
yaeM-(CT071) 

troD/ytgD-Integral Membrane Protein-(CT070) 
troC/ytgC-Integral Membrane Protein-(CT069) 
troB/ytgB-ABC transporter ATPase-(CT068) 
troA/ytgA-Solute Protein Binding Family-(CT067) 
CT066 hypothetical protein 
adt_1-ADP/ATP Translocase_1-(CT065) 


CPn0352 


395574 


396830 


F 


CPn0353 


396893 


397135 


F 


CPn0354 


397167 


398507 


F 


CPn0355 


399889 


398591 


R 


CPn0356 


400459 


400109 


R 


CPn0357 


401317 


400469 


R 


CPn0358 


401751 


401578 


R 


CPn0359 


402012 


403817 


F 


CPn0360 


405358 


403922 


R 


CPn0361 


406647 


405382 


R 


CPn0362 


407825 


407055 


R 


CPn0363 


409688 


407943 


R 


CPn0364 


409966 


410238 


F 


CPn0365 


410528 


411544 


F 


CPn0366 


411976 


412440 


F 


CPn0367 


413102 


413836 


F 


CPn0368 


413790 


414107 


F 


CPn0369 


414351 


415562 


F 


CPn0370 


415800 


416912 


F 


CPn0371 


417147 


417503 


F 


CPn0372 


417687 


418001 


F 


CPn0373 


418380 


420218 


F 


CPn0374 


420218 


420961 


F 


CPn0375 


421121 


421615 


F 


CPn0376 


421854 


422294 


F 


CPn0377 


423438 


422347 


R 


CPn0378 


426168 


423445 


R 


CPn0379 


426322 


426765 


F 


CPn0380 


426758 


427876 


F 


CPn0381 


429809 


428037 


R 


CPn0382 


430749 


430036 


R 


CPn0383 


431693 


430749 


R 


CPn0384 


432377 


431862 


R 


CPn0385 


434018 


432522 


R 


CPn0386 


434525 


434046 


R 


CPn0387 


435196 


434699 


R 


CPn0388 


435329 


437320 


F 


CPn0389 


438134 


437319 


R 


CPn0390 


439144 


438134 


R 


CPn0391 


439692 


439510 


R 


CPn0392 


439814 


440383 


F 


CPn0393 


440379 


440723 


F 


CPn0394 


440736 


441968 


F 


CPn0395 


441964 


443175 


F 


CPn0396 


444353 


443241 


R 


CPn0397 


445115 


444381 


R 


CPn0398 


445533 


445700 


F 


CPn0399 


445879 


446523 


F 


CPn0400 


446536 


447306 


F 


CPn0401 


447884 


447495 


R 


CPn0402 


448994 


447888 


R 


CPn0403 


449015 


449710 


F 


CPn0404 


450887 


449871 


R 


CPn0405 


451739 


450966 


R 


CPn0406 


451969 


452865 


F 


CPn0407 


453742 


452858 


R 


CPn0408 


454105 


454581 


F 


CPn0409 


454645 


455127 


F 


CPn0410 


455123 


455833 


F 


CPn0411 


455833 


456609 


F 


CPn0412 


456590 


457246 


F 


CPn0413 


459203 


457227 


R 


CPn0414 


460143 


459172 


R 


CPn0415 


461498 


460221 


R 


CPn0416 


461856 


461557 


R 


CPn0417 


463035 


462244 


R 


CPn0418 


464401 


462953 


R 


CPn0419 


466834 


464876 


R 


CPn0420 


467108 


466824 


R 


CPn0421 


467998 


467108 


R 


CPn0422 


468242 


468784 


F 


CPn0423 


468791 


469216 


F 



lepA-GTPase-(CT064) 

gnd-6-Phosphogluconate Dehydrogenase-(CT063) 
tyrS-tyrosyl tRNA Synthetase-(CT062) 
fliA/rpsD-Sigma-28/WhiG Family-(CT061) 
flhA-Flagellar Secretion Protein-(CT060) 
fer4-Ferredoxin IV-(CT059) 



CT058 hypothetical protein_2 
CT058 hypothetical protein_3 



gcpE-(CT057) 

CT056 hypothetical protein 

sucB_1-Dihydrolipoamide Succinyltransferase_1-(CT055) 
sucA-Oxoglutarate Dehydrogenase-(CT054) 
CT053 hypothetical protein 

hemN_1-Coproporphyrinogen III Oxidase_1-(CT052) 
CT326 similarity 

yabC/yraL-SAM-Dependent Methyltransferase-(CT048) 
CT047 hypothetical protein 
hctB-Histone-like Protein 2-(CT046) 
pepA-Leucyl Aminopeptidase A-(CT045) 
ssb-SS DNA Binding Protein-(CT044) 
CT043 hypothetical protein 

glgX-Glycogen Hydrolase (debranching)-(CT042) 
CT041 hypothetical protein 
ruvB-Holliday Junction Helicase-(CT040) 

dcd-dCTP Deaminase-(CT039) 
CT038 hypothetical protein 

tlyC_1-CBS Domain protein (Hemolysin Homolog)_1-(CT256) 
CT257 hypothetical protein 
yhfO-NifS-related protein-(CT258) 
PP2C phosphatase family-(CT259) 

CT253 hypothetical protein 
CT254 hypothetical protein 
CT255 hypothetical protein 
mutY-Adenine Glycosylase-(CT107) 

yceC-predicted pseudouridine synthetase family-(CT106) 
CT105 hypothetical protein 

fabI-Enoyl-Acyl-Carrier Protein Reductase-(CT104) 
HAD superfamily hydrolase/phosphatase-(CT103) 
CT102 hypothetical protein 
CT260 hypothetical protein 

dnaQ_1-DNA Pol III Epsilon Chain_1-(CT261) 

CT262 hypothetical protein 

CT263 hypothetical protein 

msbA-Transport ATP Binding Protein-(CT264) 

accA-AcCoA Carboxylase/Transferase Alpha-(CT265) 

CT266 hypothetical protein 

himD/ihfA-Integration Host Factor Alpha-(CT267) 
amiA-N-Acetylmuramoyl Alanine Amidase-(CT268) 
murE-N-Acetylmuramoylalanylglutamyl DAP Ligase-(CT269) 
pbp3-transglycolase/transpeptidase-(CT270) 
CT271 hypothetical protein 

yabC-PBP2B Family methyltransferase-(CT272) 
CT273 hypothetical protein 
CT274 hypothetical protein 



CPn0424 


469612 


470961 


F 


CPn0425 


470980 


471564 


F 


CPn0426 


472111 


471536 


R 


CPn0427 


472207 


473715 


F 


CPn0428 


473722 


474681 


F 


CPn0429 


474681 


475319 


F 


CPn0430 


475326 


476093 


F 


CPn0431 


476483 


476151 


R 


CPn0432 


476816 


476514 


R 


CPn0433 


477273 


476929 


R 


CPn0434 


479462 


477276 


R 


CPn0435 


480902 


479475 


R 


CPn0436 


481618 


480902 


R 


CPn0437 


481816 


484350 


F 


CPn0438 


485416 


484334 


R 


CPn0439 


485553 


486077 


F 


CPn0440 


486105 


486740 


F 


CPn0441 


486891 


487838 


F 


CPn0442 


488013 


488528 


F 


CPn0443 


488729 


489979 


F 


CPn0444 


490287 


494507 


F 


CPn0445 


494772 


497579 


F 


CPn0446 


497626 


500415 


F 


CPn0447 


500568 


503351 


F 


CPn0448 


504810 


503698 


R 


CPn0449 


507231 


505330 


R 


CPn0450 


508112 


507180 


R 


CPn0451 


508275 


511058 


F 


CPn0452 


511319 


512860 


F 


CPn0453 


513234 


516152 


F 


CPn0454 


516182 


519115 


F 


CPn0455 


520348 


519458 


R 


CPn0456 


521532 


520327 


R 


CPn0457 


523865 


522120 


R 


CPn0458 


526320 


524236 


R 


CPn0459 


527005 


526619 


R 


CPn0460 


527840 


526992 




CPn0461 


528638 


527844 


R 


CPn0462 


531052 


529037 


R 


CPn0463 


532357 


531191 


R 


CPn0464 


532842 


532366 


R 


CPn0465 


533212 


532871 


R 


CPn0466 


533724 


536537 


F 


CPn0467 


536633 


539434 


F 


CPn0468 


539632 


540432 


F 


CPn0469 


540399 


541460 


F 


CPn0470 


541357 


542532 


F 


CPn0471 


542564 


545401 


F 


CPn0472 


547905 


545581 


R 


CPn0473 


549593 


548070 


R 


CPn0474 


551573 


549807 


R 


CPn0475 


553844 


551685 


R 


CPn0476 


554844 


553858 


R 


CPn0477 


556106 


554844 


R 


CPn0478 


557625 


556210 


R 


CPn0479 


558425 


557616 


R 


CPn0480 


559303 


558650 


R 


CPn0481 


560946 


559339 


R 


CPn0482 


561737 


560961 


R 


CPn0483 


561836 


564964 


F 


CPn0484 


564970 


565824 


F 


CPn0485 


566038 


566229 


F 


CPn0486 


567784 


566405 


R 


CPn0487 


569740 


568112 


R 


CPn0488 


570096 


569767 


R 


CPn0489 


570965 


570096 


R 


CPn0490 


571279 


573333 


F 


CPn0491 


574352 


573336 


R 


CPn0492 


574652 


574804 


F 


CPn0493 


575004 


574855 


R 


CPn0494 


575364 


575146 


R 


CPn0495 


575603 


576793 


F 



dnaA_2-Replication Initiation Factor_2-(CT275) 
CT276 hypothetical protein 
CT277 similarity 

nqr2-NADH (Ubiquinone) Dehydrogenase-(CT278) 
nqr3-NADH (Ubiquinone) Oxidoreductase, Gamma-(CT279) 
nqr4-NADH (Ubiquinone) Reductase 4-(CT280) 
nqr5-NADH (Ubiquinone) Reductase 5-(CT281) 



gcsH-Glycine Cleavage System H Protein-(CT282) 
CT283 hypothetical protein 

Phospholipase D superfamily [uncleavable leader peptide]-(CT284) 
lplA-Lipoate Protein Ligase-Like Protein-(CT285) 
clpC-ClpC Protease-(CT286) 
ycbF-PP-loop superfamily ATPase-(CT287) 



CT007 hypothetical protein 
CT006 hypothetical protein 
CT005 hypothetical protein 

pmp_6-Polymorphic Outer Membrane Protein G/I Family 
pmp_7-Polymorphic Outer Membrane Protein X? Family 
pmp_8-Polymorphic Outer Membrane Protein G Family 
pmp_9-Polymorphic Outer Membrane Protein G/I Family 
*yxjG_Bs_2 Hypothetical Protein 
pmp_10-PMP_10 (Frame-shift with 0451) 
pmp_10-Polymorphic Outer Membrane Protein G Family 
pmp_11-Polymorphic Outer Membrane Protein G Family 
pmp_12-Polymorphic Outer Membrane Protein A/I Family (truncated) 
pmp_13-Polymorphic Outer Membrane Protein G Family 



pmp_14-Polymorphic Outer Membrane Protein H Family 
pmp_15-Polymorphic Outer Membrane Protein E Family 
pmp_16-Polymorphic Outer Membrane Protein E Family 
pmp_17-Polymorphic Outer Membrane Protein E Family 
pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0469) 
pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0470) 
pmp_18-Polymorphic Outer Membrane Protein E/F Family 







CT365 hypothetical protein 
glgB-Glucan Branching Enzyme-(CT866) 
CT865 hypothetical protein 
*yqeV_Bs Hypothetical Protein 
hflX-GTP Binding Protein-(CT379) 
phnP-Metal Dependent Hydrolase-(CT380) 
CT383 hypothetical protein 

artJ-Arginine Periplasmic Binding Protein-(CT381) 

aroG-Deoxyheptonate Aldolase-(CT382) 
CT382.1 hypothetical protein 
*hypothetical proline permease 
CT384 hypothetical protein 
hitA-HIT Family Hydrolase-(CT385) 
CT386 hypothetical protein 
CT387 hypothetical protein 
CT389 hypothetical protein 



aspC-Aspartate Aminotransferase-(CT390) 


CPn0496 


576793 


577812 


F 


CPn0497 


578089 


577820 


R 


CPn0498 


579035 


578085 


R 


CPn0499 


580359 


579205 


R 


CPn0500 


580659 


582362 


F 


CPn0501 


582457 


583650 


F 


CPn0502 


583650 


584201 


F 


CPn0503 


584234 


586213 


F 


CPn0504 


586487 


588514 


F 


CPn0505 


588519 


589106 


F 


CPn0506 


589172 


589840 


F 


CPn0507 


589961 


590122 


F 


CPn0508 


590142 


590300 


F 


CPn0509 


590335 


590808 


F 


CPn0510 


590813 


591973 


F 


CPn0511 


592141 


592488 


F 


CPn0512 


592553 


594412 


F 


CPn0513 


594647 


595753 


F 


CPn0514 


595729 


596520 


F 


CPn0515 


596492 


597181 


F 


CPn0516 


598814 


597255 


R 


CPn0517 


599631 


598795 


R 


CPn0518 


600803 


599832 


R 


CPn0519 


601674 


600904 


R 


CPn0520 


602218 


601646 


R 


CPn0521 


603797 


602241 


R 


CPn0522 


603987 


604655 


F 


CPn0523 


604723 


605052 


F 


CPn0524 


605103 


606179 


F 


CPn0525 


606522 


607283 


F 


CPn0526 


608696 


607710 


R 


CPn0527 


609904 


608726 


R 


CPn0528 


611162 


609921 


R 


CPn0529 


612259 


611165 


R 


CPn0530 


613254 


612460 


R 


CPn0531 


614069 


613245 


R 


CPn0532 


614674 


614075 


R 


CPn0533 


614930 


615385 


F 


CPn0534 


615413 


615784 


F 


CPn0535 


615793 


616296 


F 


CPn0536 


616345 


617691 


F 


CPn0537 


617833 


618189 


F 


CPn0538 


618212 


618511 


F 


CPn0539 


618705 


621545 


F 


CPn0540 


621694 


626862 


F 


CPn0541 


627170 


628003 


F 


CPn0542 


628003 


628737 


F 


CPn0543 


628725 


629603 


F 


CPn0544 


630529 


629525 


R 


CPn0545 


630884 


630633 


R 


CPn0546 


631229 


630912 


R 


CPn0547 


631661 


632188 


F 


CPn0548 


633231 


632191 


R 


CPn0549 


633569 


633255 


R 


CPn0550 


635661 


633580 


R 


CPn0551 


636168 


635698 


R 


CPn0552 


636587 


636219 


R 


CPn0553 


637747 


636812 


R 


CPn0554 


637854 


638141 


F 


CPn0555 


638298 


640241 


F 


CPn0556 


640912 


640325 


R 


CPn0557 


642861 


641194 


R 


CPn0558 


643300 


643031 


R 


CPn0559 


643742 


643927 


F 


CPn0560 


645612 


644098 


R 


CPn0561 


646404 


645871 


R 


CPn0562 


648036 


646918 


R 


CPn0563 


650056 


648293 


R 


CPn0564 


654350 


650145 


R 


CPn0565 


655630 


654533 


R 


CPn0566 


656141 


656890 


F 


CPn0567 


656894 


657817 


F 



CT391 hypothetical protein 
CT388 hypothetical protein 



proS-Prolyl tRNA Synthetase- {CT3 93) 

hrcA-HTH Transcriptional Repressor- {CT3 94 ) 

grpE-HSP-70 Cof actor- (CT3 95 ) 

dnaK-HSP-70- (CT396) 

vacB-ribonuclease family- (CT3 97) 

•3 -methyl adenine DNA glycosylase 

CT421 hypothetical protein 

CT421.1 hypothetical protein 

CT421.2 hypothetical protein 

(predicted Metal loenzyme) - {CT422 ) 

tlyC_2-CBS Domains (Hemolysin homolog}_2- (CT423 ) 

rsbV_l-Sigma Regulatory Factor_l- (CT424 ) 

CT425 hypothetical protein 

Fe-S oxidoreductase_l-<CT426) 

CT4 27 hypothetical protein 

ubiE-Obiquinone Methyltransf erase- (CT428) 



CT42 9 hypothetical protein 
dapF-Diaminopimelate Epimerase- CCT430) 
clpP-CLP Protease- (CT431) 

glyA-Serine Hydroxyme thy 1 transferase- (CT432) 
CT433 hypothetical protein 



CT39 8 hypothetical protein 

yrbH-GutQ/KpsF Family Sugar-P Isomerase- (CT399 ) 

sucB_2-Dihydrolipoamide Succinyl transferase^- (CT400) 

gltT-Giutamate Symport- (CT4 01 ) 

ycaH-ATPase- (CT402) 

spoU_l-rRNA Methylase_l- (CT403) 

SAM dependent methyltransf erase- (CT4 04) 

ribC/risA-Ribof lavin Synthase- (CT405) 

CT406 hypothetical protein 

dksA-DnaK Suppressor- { CT4 07 ) 

IspA-Lipoprotein Signal Peptidase- (CT4 08 } 

dagA_l-D-Ala/Gly Permease_l- (CT409) 

CT814.1 hypothetical protein 

CT814 hypothetical protein 

prnp_19 -polymorphic outer membrane protein A Family -(CT412) 
pmp_20 -polymorphic outer membrane protein B Family- {CT413 ) 
Solute binding protein { -yebL-Synechocystis Adhesin Homolog) - (CT415) 
ABC Transporter ATPase- (CT416 ) 
(Metal Transport Protein) - (CT417) 
yhbZ-GTP binding protein- <CT418 ) 
rl27-L27 ribosomal protein- (CT419) 
rl21-L21 Ribosomal Protein- (CT420) 
ygbB f amily- (CT4 34 ) 
cysJ-Sulfite Reductase- (CT43 5) 
rslO-SlO Ribosomal Protein- {CT43 6} 
fusA-Elongation Factor G-{CT437) 
rs7-S7 Ribosomal Protein- (CT43 8 ) 
rsl2-S12 Ribosomal Protein- (CT43 9) 

CT440 hypothetical protein 
tsp-Tail-Specific Protease-(CT441)
crpA-15kDa Cysteine-Rich Protein-(CT442)

omcB-60kDa Cysteine-Rich Outer Membrane Complex Protein-(CT443)

omcA-9kDa-Cysteine-Rich Outer Membrane Complex Lipoprotein-(CT444)

CT441.1 hypothetical protein

gltX-Glutamyl-tRNA Synthetase-(CT445)

euo-CHLPS Euo Protein-(CT446)

*CHLPS 43 kDa protein homolog_1

recJ-ssDNA Exonuclease-(CT447)

secD&secF-Protein Export Proteins SecD/SecF (fusion)-(CT448)
CT449 hypothetical protein
yaeS family-(CT450)

cdsA-Phosphatidate Cytidylyltransferase-(CT451)
CPn0568


657817 


658464 


F 


cdsA-Phosphatidate Cytidylyltransferase-(CT452)


CPn0569 


658464 


659099 


F 


plsC-Glycerol-3-P Acyltransferase-(CT453)


CPn0570 


659107 


660789 


F 


argS-Arginyl tRNA Transferase-(CT454)


CPn0571 


662122 


660749 


R 


murA-UDP-N-Acetylglucosamine Transferase-(CT455)


CPn0572 


662352 


664616 


F 


CT456 hypothetical protein 


CPn0573 


665404 


664691 


R 


yebC family-(CT457)


CPn0574 


665945 


665394 


R 


YhhY-Amino Group Acetyl Transferase-(CT458)


CPn0575 


666494 


665982 


R 


CPn0576


667543 


666494 


R 


prfB-Peptide Chain Release Factor 2 (natural UGA frame-shift)-(CT45


CPn0576 


667598 


667530 


R 


prfB-(natural UGA frame-shift)


CPn0577 


667895 


668155 


F 


SWIB (YM74) complex protein-(CT460)


CPn0578 


668406 


669365 


F 


yaeI-phosphohydrolase-(CT461)


CPn0579 


669361 


669993 


F 


ygbP/yacM-Sugar Nucleotide Phosphorylase-(CT462)


CPn0580 


669993 


670793 


F 


truA-Pseudouridylate Synthase I-(CT463) 


CPn0581 


671434 


670745 


R 


Phosphoglycolate Phosphatase-(CT464)


CPn0582 


671503 


672177 


F 


CT465 hypothetical protein


CPn0583 


672400 


672717 


F 


CT466 hypothetical protein


CPn0584 


672707 


673798 


F


atoS/ntrB-2-Component Sensor-(CT467)


CPn0585 


675817 


673865 


R


*similarity to Cps IncA_2


CPn0586 


676026 


677183 


F


atoC/ntrC-2-Component Regulator-(CT468)


CPn0587 


677441 


678124 


F 


*yvyD_Bs conserved hypothetical protein 


CPn0588 


678084 


678626 


F 


CT469 hypothetical protein 


CPn0589


678640 


679395 


F


CT470 hypothetical protein


CPn0590


680112 


679516 


F 


CT471 hypothetical protein 


CPn0591


680373 


681020 


F 


yagE family-(CT472)


CPn0592


681153 


681461 


F


yidD family- (CT473) 


CPn0593


682476 


681391 


R


CT474 hypothetical protein 


CPn0594


682583 


684958 


F


pheT-phenylalanyl tRNA Synthetase Beta-(CT475) 


CPn0595


684958 


685926 


F 


CT476 hypothetical protein 


CPn0596


685939 


686457 


F 


ada-methyltransferase-(CT477)


CPn0597


688215 


686479 


R 


oppC_2-Oligopeptide Permease_2-(CT478)


CPn0598


689697 


688219 


R 


oppB_2-Oligopeptide Permease_2-(CT479)


CPn0599


691802 


689682 


R 


oppA_5-oligopeptide Binding Lipoprotein_5-(CT480)


CPn0600


692147 


691827 


R 




CPn0601


693053 


692736 


R 


CT483 hypothetical protein 


CPn0602


694105 


693104 


R 


CT484 hypothetical protein 


CPn0603


694205 


695185 


F 


hemZ-Ferrochelatase-(CT485)


CPn0604


695945 


695196 


R


fliY-Glutamine Binding Protein-(CT486)


CPn0605


696707 


696150 


R 


yhhF-Methylase-(CT487)


CPn0606


697444 


696707 


R 


CT488 hypothetical protein 


CPn0607


698895 


697573 


R 


glgC-Glucose-1-P Adenyltransferase-(CT489)


CPn0608 


699645 


699016 


R 


*pyrF-Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated?


CPn0609 


699705 


699986 


F 


CT490 hypothetical protein 


CPn0610 


701420 


700029 


R 


rho-Transcription Termination Factor-(CT491)


CPn0611


702025 


701420 


R 


yacE-predicted phosphatase/kinase-(CT492)


CPn0612 


704631 


702022 


R 


polA-DNA Polymerase I-(CT493)


CPn0613 


705656 


704658 


R 


sohB-Protease-(CT494)


CPn0614 


707402 


705783 


R 


adt_2-ADP/ATP Translocase_2-(CT495)


CPn0615 


708137 


707634 


R 


pgsA_1-Glycerol-3-P Phosphatidyltransferase_1-(CT496)


CPn0616


708791 


710137 


F 


dnaB-Replicative DNA Helicase-(CT497)


CPn0617 


710484 


712316 


F 


gidA-FAD-dependent oxidoreductase-(CT498)


CPn0618 


712306 


713010 


F


lplA-Lipoate-Protein Ligase A-(CT499)


CPn0619 


713444 


713013 


R 


ndk-Nucleoside-2-P Kinase-(CT500)


CPn0620


714139 


713519 


R 


ruvA-Holliday Junction Helicase-(CT501)


CPn0621 


714647 


714144 


R 


ruvC-Crossover Junction Endonuclease-(CT502)


CPn0622 


715752 


714793 


R 


CT503 hypothetical protein 


CPn0623 


716993 


716163 


R 


CT504 hypothetical protein 


CPn0624 


718015 


717011 


R 


gapA-Glyceraldehyde-3-P Dehydrogenase-(CT505)


CPn0625 


718485 


718060 


R 


rl17-L17 Ribosomal Protein-(CT506)


CPn0626 


719616 


718495 


R 


rpoA-RNA Polymerase Alpha-(CT507)


CPn0627 


720038 


719640 


R 


rs11-S11 Ribosomal Protein-(CT508)


CPn0628 


720428 


720063 


R 


rs13-S13 Ribosomal Protein-(CT509)


CPn0629 


721857 


720487 


R 


secY-Translocase-(CT510)


CPn0630 


722316


721885 


R 


rl15-L15 Ribosomal Protein-(CT511)


CPn0631 


722806 


722312 


R 


rs5-S5 Ribosomal Protein-(CT512)


CPn0632 


723195 


722827 


R 


rl18-L18 Ribosomal Protein-(CT513)


CPn0633


723757 


723209 


R 


rl6-L6 Ribosomal Protein-(CT514)


CPn0634


724185 


723787 


R 


rs8-S8 Ribosomal Protein-(CT515)


CPn0635


724745 


724206 


R 


rl5-L5 Ribosomal Protein-(CT516)


CPn0636 


725082 


724750 


R 


rl24-L24 Ribosomal Protein-(CT517)


CPn0637


725464 


725099 


R 


rl14-L14 Ribosomal Protein-(CT518)


CPn0638


725747 


725490


R 


rs17-S17 Ribosomal Protein-(CT519)
CPn0639


725958 


725743 


R 


rl29-L29 Ribosomal Protein-(CT520)


CPn0640 


726377 


725964 


R 


rl16-L16 Ribosomal Protein-(CT521)


CPn0641


727077 


726409 


R 


rs3-S3 Ribosomal Protein-(CT522)


CPn0642


727428 


727096 


R 


rl22-L22 Ribosomal Protein-(CT523)


CPn0643


727713 


727450 


R 


rs19-S19 Ribosomal Protein-(CT524)


CPn0644


728573 


727722 


R 


rl2-L2 Ribosomal Protein-(CT525)


CPn0645


728930 


728598 


R 


rl23-L23 Ribosomal Protein-(CT526)


CPn0646


729621 


728950 


R 


rl4-L4 Ribosomal Protein-(CT527)


CPn0647


730331 


729657 


R 


rl3-L3 Ribosomal Protein-(CT528)


CPn0648


731603 


730605 


R 


CT529 hypothetical protein




732672 


731710 


R 


fmt-Methionyl tRNA Formyltransferase-(CT530)


CPn0650


733501 


732665 


R 


lpxA-Acyl-Carrier UDP-GlcNAc-(CT531)


CPn0651


733975 


733517 


R 


fabZ-Myristoyl-Acyl Carrier Dehydratase-(CT532)


CPn0652


734835


733990


R 


lpxC-Myristoyl GlcNac Deacetylase-(CT533)


CPn0653


736490


734868 


R 


cutE-Apolipoprotein N-Acetyltransferase-(CT534)


CPn0654


736967 


736503 


R 


vdlD/yciA-acyl-CoA Thioesterase-(CT535)


CPn0655 


737847 


737101 


R 


dnaQ_2-DNA Pol III Epsilon Chain_2-(CT536)




737872 


738048 


F 




CPn0657


738473 


738051 


R 


yjeE (ATPase or Kinase)-(CT537)


CPn0658 


739168 


738455 


R 


CT538 hypothetical protein


CPn0659 


739533 


739838 


F 


trxA-Thioredoxin-(CT539)


CPn0660


740327 


739860 


R 


spoU_2-rRNA Methylase_2-(CT540)


CPn0661


741100


740327


R


mip-FKBP-type peptidyl-prolyl cis-trans isomerase-(CT541)


CPn0662


742923


741172 


R 


aspS-Aspartyl tRNA Synthetase-(CT542)


CPn0663


744190 


742901 


R 


hisS-Histidyl tRNA Synthetase-(CT543)


CPn0664


744757 


744557 


R 




CPn0665


745001 


746365 


F 


uhpC-Hexosephosphate Transport-(CT544)


CPn0666


746388 


750107 


F 


dnaE-DNA Pol III Alpha-(CT545)


CPn0667


751058


750177


R


predicted OMP [leader (17)]-(CT546)


CPn0668


751209


752162 


F 


CT547 hypothetical protein 


CPn0669


752179 


752775 


F 


CT548 hypothetical protein 


CPn0670


752765 


753196




rsbW-sigma regulatory factor-histidine kinase-(CT549)




753630 


753205 


R 


CT550 hypothetical protein


CPn0672 


753741 


755048 


F 


dacF (pbp5)-D-Ala-D-Ala Carboxypeptidase-(CT551)


CPn0673


755287 


755463 


F 


CT552 hypothetical protein 


CPn0674


756668 


755577 


R 


fmu-RNA Methyltransferase-(CT553)


CPn0675


757919 


756768 


R 


CT696 hypothetical protein


CPn0676


759217 


758051 


R * 


homologous to CT695 


CPn0677


760401 


759256 


R 




CPn0678


761320 


760682 


R 




CPn0679 


762930 


761725 


R 


pgk-Phosphoglycerate Kinase-(CT693)


CPn0680 


764248 


762971 


R 


ygo4-Phosphate Permease-(CT692)


CPn0681


764929 


764258 


R 


CT691 hypothetical protein 


CPn0682 


764984 


765955 


F 


dppD-ABC ATPase Dipeptide Transport-(CT690)


CPn0683 


765948 


766919 


F 


dppF-ABC ATPase Dipeptide Transport-(CT689)


CPn0684


768038 


767181 


R 


spoJ/parB-Chromosome Partitioning Protein-(CT688)


CPn0685 


768068 


768217 


F 




CPn0686 


768361 


768176 


R 




CPn0687 


768564 


769214 


F 


CT482 hypothetical protein


CPn0688


769382 


770137 


F 


CT481 hypothetical protein


CPn0689 


771404 


770187 


R 


yfhO_1-NifS-related Aminotransferase_1-(CT687)


CPn0690 


772680 


771436 


R 


ABC Transporter Membrane Protein-(CT686)


CPn0691 


773452 


772685 


R 


abcX-ABC Transporter ATPase-(CT685)


CPn0692 


774912 


773461 


R 


ABC Transporter-(CT684)


CPn0693 


776256 


775240 


R 


TPR Repeats (O-Linked GlcNAc Transferase similarity)-(CT683)


CPn0694 


779599 


776330 


R 


pbp2-PBP2-transglycolase/transpeptidase-(CT682)


CPn0695 


780216 


781382 


F 


ompA-Major Outer Membrane Protein-(CT681)


CPn0696 


781769 


782599 


F 


rs2-S2 Ribosomal Protein-(CT680)


CPn0697 


782602 


783447 


F 


tsf-Elongation Factor TS-(CT679)


CPn0698 


783458 


784201 


F 


pyrH-UMP Kinase-(CT678)


CPn0699 


784182 


784721 


F 


rrf-Ribosome Releasing Factor-(CT677)


CPn0700 


785097 


785609 


F 


CT676 hypothetical protein


CPn0701 


785599 


786672 


F 


karG-Arginine Kinase-(CT675)


CPn0702 


789685 


786929 


R 


yscC/gspD-Yop C/Gen Secretion Protein D-(CT674) 


CPn0703 


791190 


789685 


R 


pkn5-S/T Protein Kinase-(CT673)


CPn0704 


792321 


791209 


R 


fliN-Flagellar Motor Switch Domain/YscQ family-(CT672)


CPn0705


793173 


792334 


R 


CT671 hypothetical protein 


CPn0706 


793683 


793180 


R 


CT670 hypothetical protein 


CPn0707 


795029 


793704 


R 


yscN-Yop N (Flagellar-Type ATPase)-(CT669)


CPn0708 


795705 


795034 


R 


CT668 hypothetical protein 


CPn0709 


796188 


795742 


R 


CT667 hypothetical protein 


CPn0710


796461 


796210 


R 


CT666 hypothetical protein 
CPn0711


796731 


796486 


R 


CT665 hypothetical protein 


CPn0712


799315 


796781 


R 


(FHA domain; homology to adenylate cyclase)-(CT664)


CPn0713 


799721 


799332 


R 


CT663 hypothetical protein 


CPn0714 


801107 


800091 


R 


hemA-Glutamyl tRNA Reductase-(CT662)


CPn0715 


801657 


803462 


F 


gyrB_2-DNA Gyrase Subunit B_2-(CT661)


CPn0716 


803469 


804902 


F 


gyrA_2-DNA Gyrase Subunit A_2-(CT660)


CPn0717 


805010 


805306 


F 


CT656 hypothetical protein 


CPn0718 


805309 


805626 


F 


CT657 hypothetical protein 


CPn0719 


805916 


806890 


F 


sfhB-(Pseudouridine Synthase)-(CT658)


CPn0720 


807003 


807236 


F 


CT659 hypothetical protein 


CPn0721 


807683 


808489 


F 


kdsA-KDO Synthetase-(CT655)


CPn0722 


808489 


808974 


F 


CT654 hypothetical protein 


CPn0723 


808984 


809703 


F 


yhbG-ABC Transporter ATPase-(CT653)


CPn0724


810527 


809706 


R 




CPn0725


810811 


810587 


R 


CT652.1 hypothetical protein 


CPn0726 


813372 


810880 


R 


CT620 hypothetical protein


CPn0727 


813577 


816192 


F 


CT619 hypothetical protein 


CPn0728 


818477 


816525 


R 


CHLPN 76kDa Homolog_1 (CT622)


CPn0729 


819857 


818592 


R 


CHLPN 76kDa Homolog_2 (CT623) 


CPn0730 


821603 


819963 


R 


mviN-Integral Membrane Protein-(CT624)


CPn0731 


821587 


821760 


F 




CPn0732 


822098 


822976 


F 


nfo-Endonuclease IV-(CT625) 


CPn0733


823727 


823101 


R 


rs4-S4 Ribosomal Protein-(CT626)


CPn0734


823944 


824915 


F 


yceA-(CT627)


CPn0735


825668 


825003 


R 


*pyrH/udk-Uridine Kinase (Uridine Monophosphokinase) (Pyrimidine Ribonucleoside Kinase)


CPn0736


827686 


825992 


R 


ygeD-Efflux Protein-(CT641)


CPn0737


827685 


830756 


F 


recC-Exodeoxyribonuclease V, Gamma-(CT640)


CPn0738


830746 


833895 


F 


recB-Exodeoxyribonuclease V, Beta-(CT639) 


CPn0739


834871 


833861 


R 


CT638 hypothetical protein 


CPn0740


836048


834864 


R 


tyrB-Aromatic AA Aminotransferase-(CT637)


CPn0741


838350 


836185 


R 


greA-Transcription Elongation Factor-(CT636)


CPn0742 


838463 


838888 


F 


CT635 hypothetical protein


CPn0743


838962 


840362 


F 


nqrA-Ubiquinone Oxidoreductase, Alpha-(CT634)


CPn0744


841384 


840389 


R 


hemB-Porphobilinogen Synthase-(CT633)


CPn0745


841903 


841742 


R 




CPn0746


841975 


843567 


F 


CT632 hypothetical protein 


CPn0747


843675 


843740 


F* 


CT631 hypothetical protein 


CPn0747


843725 


843910 


F 


CT631 hypothetical protein (frame-shift)


CPn0748


844987 


844121 


R 


ispA-Geranyltranstransferase-(CT628)


CPn0749


845629 


845006 


R 


glmU-UDP-GlcNAc Pyrophosphorylase-(CT629)


CPn0750 


846411 


845707 


R 


tctD/cpxR-HTH Transcriptional Regulatory Protein + Receiver Domain-(CT630)


CPn0751 


846608 


848434 


F 


CT651 hypothetical protein


CPn0752 


848604 


850082 


F 


recD_2-Exodeoxyribonuclease V, Alpha_2-(CT652)


CPn0753 


851006 


850161 


R 




CPn0754 


851336 


851040 


R 


rs20-S20 Ribosomal Protein-(CT617)


CPn0755 


851597 


852799 


F 


CT616 hypothetical protein 


CPn0756 


852961 


854676 


F 


rpoD-RNA Polymerase Sigma-66-(CT615)


CPn0757 


854733 


855134 


F 


folX-Dihydroneopterin Aldolase-(CT614)


CPn0758 


855110 


856459 


F 


folP/dhpS-Dihydropteroate Synthase-(CT613)


CPn0759 


856488 


856997 


F


folA-Dihydrofolate Reductase-(CT612)


CPn0760 


856957 


857694 


F 


CT611 hypothetical protein 


CPn0761


857704 


858375 


F 


CT610 hypothetical protein 


CPn0762 


859597 


858539 


R 


recA-RecA recombination protein-(CT650)


CPn0763 


860511 


859972 


R 


ygfA-Formyltetrahydrofolate Cycloligase-(CT649)


CPn0764 


861807 


860524 


R 


CT648 hypothetical protein


CPn0765 


862382 


861801 


R 


CT647 hypothetical protein 


CPn0766


863782 


862394 


R 


CT646 hypothetical protein


CPn0767 


863884 


864177 


F 


CT645 hypothetical protein


CPn0768


864159 


865163 


F 


yohI/nir3-predicted oxidoreductase-(CT644)


CPn0769 


867733 


865121 


R 


topA-DNA Topoisomerase I-Fused to SWI Domain-(CT643)


CPn0770 


868340 


869131 


F 


CT642 hypothetical protein 


CPn0771 


870463 


869144 


R 


rpoN-RNA Polymerase Sigma-54-(CT609)


CPn0772 


872385 


870469 


R 


uvrD-DNA Helicase-(CT608)


CPn0773


872488 


873195 


F 


ung-Uracil DNA Glycosylase-(CT607)


CPn0774 


873195 


873425 


F 


CT606.1 hypothetical protein 


CPn0775 


874031 


873414 


R 


yggV family-(CT606)


CPn0776 


874246 


875487 


F 


CT605 hypothetical protein 


CPn0777 


875601 


877178 


F 


groEL_2-heat shock protein-60-(CT604)


CPn0778 


877505 


878092 


F 


tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase-(CT603)


CPn0779 


878481 


878095 


R 


CT602 hypothetical protein 
879205


878591


R 


CPn0781


879773 


879198 


R 


CPn0782


881065 


879773 


R 


CPn0783


881885 


881100 


R 


CPn0784


882296 


881892 


R 


CPn0785 


882991 


882296 


R 


CPn0786 


883185 


885293 


F 


CPn0787


885619 


886401 


F 


CPn0788


886542 


887432 


F 


CPn0789


887439


889316 


F 


CPn0790


Q OQ *5 *5 n 


890103 




CPn0791


OjjUju 


890111 


R 


CPn0792


OQJ Q1 Q 


893108


R 


CPn0793




894919 


R 


CPn0794


OQ71 n A 


O J O w w *m 




CPn0795


898128 


899195 




CPn0796




901340




CPn0797




902694


p 


CPn0798




903856


p 


CPn0799


QflJ Qft 


903940


R 


CPn0800




905249 


R 


CPn0801


Qflfl fiQ7 
3UODJ / 


906727 




CPn0802




908709


R 
938267


938434


F




939747 


938827






941129 


939747




CPn0834


941553 


942014


F




945689


942045




CPn0836


946879


945722 


R 




947771 


947145




CPn0838


949106 


947781


R 


CPn0839


949257


950159 


F




950222 


951544 


F




951731 


954640


F 




954883 


954710




CPn0843 


955191 


954994 


R 


CPn0844 


956730 


955270 


R 


CPn0845 


958079 


956S50 


R 


CPn0846


959374 


958112 


R 


CPn0847


959995 


959387 


R 


CPn0848 


961502 


960177 


R 


CPn0849


961788 


965285 


F 


CPn0850


965293 


966390 


F 



papQ/amiB-N-Acetylmuramoyl-L-Ala Amidase-(CT601)
pal-Peptidoglycan-Associated Lipoprotein-(CT600)
tolB-polysaccharide transporter-(CT599)
CT598 hypothetical protein
exbD-Biopolymer Transport Protein-(CT597)
exbB/tolQ-polysaccharide transporter-(CT596)
dsbD/xprA-Thio:disulfide Interchange Protein-(CT595)
yabD/ycfH-PHP superfamily (urease/pyrimidinase) hydrolase-(CT594)
sdhC-Succinate Dehydrogenase-(CT593)
sdhA-Succinate Dehydrogenase-(CT592)
sdhB-Succinate Dehydrogenase-(CT591)
CT590 hypothetical protein
CT589 hypothetical protein

rbsU-sigma regulatory family protein-PP2C phosphatase (RsbW antagonist)-(CT588)



eno-Enolase-(CT587)

uvrB-Excinuclease ABC Subunit B-(CT586)
trpS-Tryptophanyl tRNA Synthetase-(CT585)
CT584 hypothetical protein
gp6D-CHLTR Plasmid Paralog-(CT583)

minD-chromosome partitioning ATPase-CHLTR plasmid protein GP5D-(CT582)

thrS-Threonyl tRNA Synthetase-(CT581)

CT580 hypothetical protein

CT579 hypothetical protein

CT578 hypothetical protein

CT577 hypothetical protein

lcrH_1-Low Ca Response Protein H_1-(CT576)

mutL-DNA Mismatch Repair-(CT575)

pepP-Aminopeptidase P-(CT574)

CT573 hypothetical protein

gspD/pilQ-Gen. Secretion Protein D-(CT572)

gspE-Gen. Secretion Protein E-(CT571)

gspF-Gen. Secretion Protein F-(CT570)

predicted OMP [leader (16) peptide]-(CT569)

CT568 hypothetical protein

CT567 hypothetical protein

CT566 hypothetical protein

CT565 hypothetical protein

yscT/spaR-YopT Translocation T-(CT564)

yscS/fliQ-YopS/fliQ Translocation Protein-(CT563)

yscR-Yop Translocation R-(CT562)

yscL-Yop Translocation L-(CT561)

CT560 hypothetical protein 

yscJ-Yop Translocation J-(CT559) 



lipA-Lipoate Synthetase-(CT558)
lpdA-Lipoamide Dehydrogenase-(CT557)
CT556 hypothetical protein
mot1_1-SWI/SNF family helicase_1-(CT555)
brnQ-Amino Acid (Branched) Transport-(CT554)
nth-Endonuclease III-(CT697)

thdF-Thiophene/Furan Oxidation Protein-(CT698)
psdD-Phosphatidylserine Decarboxylase-(CT699)
CT700 hypothetical protein
secA_2-Translocase SecA_2-(CT701)

CT702 hypothetical protein (frame-shift with 0843) 
CT702 hypothetical protein 
yphC-GTPase/GTP-binding protein-(CT703)
pcnB_1-Poly A Polymerase_1-(CT704)
clpX-CLP Protease ATPase-(CT705)
clpP-CLP Protease Subunit-(CT706)
tig/murI-Trigger Factor-peptidyl-prolyl isomerase-(CT707)
mot1_2-SWI/SNF family helicase_2-(CT708)
mreB-Rod Shape Protein-Sugar Kinase-(CT709)
CPn0851


966396 


968195 


F 


CPn0852 


968316 


970613 


F 


CPn0853 


970637 


971803 


F 


CPn0854


972837 


971806 


R 


CPn0855 


973995 


972994 


R 


CPn0856 


975377 


973995 


R 


CPn0857 


975757 


975392 


R 


CPn0858 


977055 


975757 


R 


CPn0859 


977588 


977055 


R 


CPn0860 


978630 


977608 


R 


CPn0861 


979722 


978925 


R 


CPn0862 


980873 


979722 


R 


CPn0863 


981514 


980831 


R 


CPn0864


981670 


982374 


F 


CPn0865 


982418 


982942 


F 


CPn0866 


983491 


982916 


R 


CPn0867 


983423 


984667 


F 


CPn0868 


986643 


984670 


R


CPn0869 


987401 


986658 


R


CPn0870


988728 


987448 


F 


CPn0871 


988772 


989899 


F 


CPn0872 


989963 


991216 


F 


CPn0873


991233 


991694 


F 


CPn0874


993107 


991749 


F 


CPn0875


993372 


994022 


F 


CPn0876


994144 


995517 


F 


CPn0877


995533 


995982 


F


CPn0878


996654 


995992 


F 


CPn0879


997439 


996645 


R 


CPn0880


999861 


997444 


R 


CPn0881


1005667 


1006209 


F 


CPn0882


1006268 


1007404 


F 


CPn0883 


1008865 


1007573 


R 


CPn0884


1009359 


1009009 


R 


CPn0885


1010635 


1009433 


R 


CPn0886


1011276 


1010908 


R 


CPn0887


1011692 


1014157 


F 


CPn0888


1015423 


1014119 


R* 


CPn0889


1016835 


1015462 


R 


CPn0890


1017805 


1016819 


R 


CPn0891


1021073 


1017819 


R 


CPn0892 


1023661 


1021046 


R 


CPn0893 


1023894 


1025888 


F 


CPn0894


1026766 


1025888 


R 


CPn0895 


1026988 


1027557 


F 


CPn0896 


1027595 


1027822 


F 


CPn0897 


1028737 


1027853 


R 


CPn0898 


1030460 


1028904 


R 


CPn0899 


1030875 


1032215 


F 


CPn0900 


1032235 


1033281 


F 


CPn0901 


1033287 


1034537 


F 


CPn0902 


1034543 


1035241


F 


CPn0903 


1035263 


1036417 


F 


CPn0904 


1036326 


1037396 


F 


CPn0905 


1037409 


1039835 


F 


CPn0906 


1040340 


1039915 


R 


CPn0907 


1040780 


1040445 


R 


CPn0908 


1041589 


1040780 


R 


CPn0909 


1041637 


1041966 


F 


CPn0910 


1041979 


1043004 


F 


CPn0911 


1044043 


1042985 


R 


CPn0912 


1044129 


1045750 


F 


CPn0913 


1045760 


1045945 


F 


CPn0914 


1045999 


1046397 


F 


CPn0915 


1046461 


1046817 


F 


CPn0916 


1046837 


1048084 


F 


CPn0917 


1048090 


1048539 


F 


CPn0918 


1049223 


1048579 


R 


CPn0919 


1049378 


1050430 


F 


CPn0920 


1051405 


1050431 


R 


CPn0921 


1051535


1052293 


F 



pckA-Phosphoenolpyruvate Carboxykinase-(CT710)

CT711 hypothetical protein

CT712 hypothetical protein

ompB-Outer Membrane Protein B-(CT713)

gpdA-Glycerol-3-P Dehydrogenase-(CT714)

AgX-1 Homolog-UDP-Glucose Pyrophosphorylase-(CT715)

CT716 hypothetical protein 

fliI-Flagellum-specific ATP Synthase-(CT717)
CT718 hypothetical protein
fliF-Flagellar M-Ring Protein-(CT719)
nifU-NifU-related protein-(CT720)
yfhO_2-NifS-related protein_2-(CT721)
pgmA-Phosphoglycerate Mutase-(CT722)
yjbC-predicted pseudouridine synthase-(CT723)
CT724 hypothetical protein
birA-Biotin Synthetase-(CT725)
rodA-Rod Shape Protein-(CT726)

zntA/cadA-Metal Transport P-type ATPase-(CT727)
CT728 hypothetical protein
serS-Seryl tRNA Synthetase_2-(CT729)
ribD-Riboflavin Deaminase-(CT730)

ribA&ribB-GTP Cyclohydratase & DHBP Synthase-(CT731)
ribE-Ribityllumazine Synthase-(CT732)
CT733 hypothetical protein
CT734 hypothetical protein

dagA_2-D-Alanine/Glycine Permease_2-(CT735)

ybcL family-(CT736)

SET Domain protein-(CT737)

yycJ-metal dependent hydrolase-(CT738)

ftsK-Cell Division Protein FtsK-(CT739)



dmpP/nqr6-Phenol hydrolase/NADH ubiquinone oxidoreductase-(CT740)
CT741 hypothetical protein
ygcA-rRNA Methyltransferase-(CT742)
hctA-Histone-Like Developmental Protein-(CT743)
CHLTR possible phosphoprotein-(CT744)
hemG-protoporphyrinogen Oxidase-(CT745)
hemN_2-Coproporphyrinogen III Oxidase_2-(CT746)
hemE-Uroporphyrinogen Decarboxylase-(CT747)
mfd-Transcription-Repair Coupling-(CT748)
alaS-Alanyl tRNA Synthetase-(CT749)
tktB-Transketolase-(CT750)
amn-AMP Nucleosidase-(CT751)
efp_2-Elongation Factor P_2-(CT752)
CT753 hypothetical protein
(possible phosphohydrolase)-(CT754)
Mitochondrial HSP60 Chaperonin Homolog-(CT755)
murF-Muramoyl-DAP Ligase-(CT756)
mraY-Muramoyl-Pentapeptide Transferase-(CT757)
murD-Muramoylalanine-Glutamate Ligase-(CT758)
nlpD-Muramidase (invasin repeat family)-(CT759)
ftsW-Cell Division Protein FtsW-(CT760)
murG-Peptidoglycan Transferase-(CT761)

murC&ddlA-Muramate-Ala Ligase & D-Ala-D-Ala Ligase-(CT762)
CT763 hypothetical protein

*cutA Periplasmic Divalent Cation Tolerance Protein CutA (C-Type Cytochrome Biogenesis Protein)
CT764 hypothetical protein
rsbV_2-Sigma Factor Regulator_2-(CT765)
miaA-tRNA Pyrophosphate Transferase-(CT766)
Fe-S cluster oxidoreductase_2-(CT767)
CT768 hypothetical protein 



ybeB-iojap superfamily ortholog-(CT769)
fabF-Acyl Carrier Protein Synthase-(CT770)
hydrolase/phosphatase homolog-(CT771)
ppa-Inorganic Pyrophosphatase-(CT772)
ldh-Leucine Dehydrogenase-(CT773)

cysQ-Sulfite Synthesis/biphosphate phosphatase-(CT774)
snGlycerol-3-P Acyltransferase-(CT775)
CPn0922


1052314 


1053927 


F 


aas-Acylglycerophosphoethanolamine Acyltransferase-(CT776)


CPn0923


1053984 


1055093 


F 


bioF_1-Oxononanoate Synthase_1-(CT777)


CPn0924 


1057274 


1055028 


R 


priA-Primosomal Protein N'-(CT778)


CPn0925 


1057900 


1057226 


R 


CT779 hypothetical protein 


CPn0926


1058060 


1058557 


F 


Thioredoxin Disulfide Isomerase-(CT780)


CPn0927 


1059809 


1058670 


R 


*CHLPS 43 kDa protein homolog_2 


CPn0928 


1061008 


1059884 


R 


*CHLPS 43 kDa protein homolog_3


CPn0929 


1062292 


1061186 


R 


*CHLPS 43 kDa protein homolog_4


CPn0930 


1062857 


1063330 


F 




CPn0931 


1064138 


1065718 


F 


lysS-Lysyl tRNA Synthetase-(CT781)


CPn0932 


1067142 


1065721 


R 


cysS-Cysteinyl tRNA Synthetase-(CT782)


CPn0933 


1067535 


1068578 


F 


predicted disulfide bond isomerase-(CT783)


CPn0934


1068942 


1068526 


R 


rnpA-Ribonuclease P Protein Component-(CT784)


CPn0935 


1069091 


1068957 


R 


rl34-L34 Ribosomal Protein-(CT785)


CPn0936


1069336 


1069470 


F 


rl36-L36 Ribosomal Protein-(CT786)


CPn0937


1069496


1069798 


F 


rs14-S14 Ribosomal Protein-(CT787)


CPn0938 


1070322 


1069849 


R 


CT788 hypothetical protein-[leader (60) peptide-periplasmic]


CPn0939 


1070728 


1071195 


F 


CT790 hypothetical protein


CPn0940 


1073012 


1071204 


R 


uvrC-Excinuclease ABC, Subunit C-(CT791)


CPn0941 


1075501 


1073018 


R 


mutS-DNA Mismatch Repair-(CT792)


CPn0942 


1075985 


1077754 


F 


dnaG/priM-DNA Primase-(CT794)


CPn0943


1077978 


1078238 


F 


CT794.1 hypothetical protein 


CPn0944


1078512 


1078997 


F 




CPn0945


1079070 


1079660 


F 


CT795 hypothetical protein


CPn0946


1082786 


1079745 


R 


glyQ-Glycyl tRNA Synthetase-(CT796)


CPn0947


1083442 


1084059 


F 


pgsA_2-Glycerol-3-P-Phosphatidyltransferase_2-(CT797)


CPn0948


1085474 


1084047 


R 


glgA-Glycogen Synthase-(CT798)


CPn0949


1085929 


1086483 


F 


ctc-General Stress Protein-(CT799)


CPn0950


1086488 


1087027 


F 


pth-Peptidyl tRNA Hydrolase-(CT800)


CPn0951


1087122 


1087457 


F 


rs6-S6 Ribosomal Protein-(CT801)


CPn0952


1087478 


1087723 


F 


rs18-S18 Ribosomal Protein-(CT802)


CPn0953


1087742 


1088248 


F 


rl9-L9 Ribosomal Protein-(CT803)


CPn0954 


1088286 


1088708 


F 


ychB-Predicted Kinase-(CT804)


CPn0955


1088612 


1089175 


F 


(frame-shift with 0954) 


CPn0956


1089560 


1090909 


F 


CT805 hypothetical protein 


CPn0957 


1093788 


1090963 


R 


ide/ptr-Insulinase family/Protease III-(CT806)


CPn0958


1094785 


1093793 


R 


plsB-Glycerol-3-P Acyltransferase-(CT807)


CPn0959


1096343 


1094799 


R* 


cafE-Axial Filament Protein-(CT808)


CPn0960


1096764 


1097102 


F 


CT809 hypothetical protein * 


CPn0961


1097118 


1097297 


F 


rl32-L32 Ribosomal Protein-(CT810)


CPn0962


1097316 


1098275 


F 


plsX-FA/Phospholipid Synthesis Protein-(CT811)


CPn0963 


1098398 


1103224 


F 


pmp_21-Polymorphic Outer Membrane Protein D Family-(CT812)


CPn0964 


1104758 


1103301 


R 




CPn0965 


1106736 


1104925 


R 


lpxB-Lipid A Disaccharide Synthase-(CT411)


CPn0966 


1108037 


1106748 


R 


pcnB_2-PolyA Polymerase_2-(CT410)


CPn0967 


1108512 


1109885 


F 


mrsA/pgm-Phosphoglucomutase-(CT815)


CPn0968 


1109895 


1111721 


F 


glmS-Glucosamine-Fructose-6-P Aminotransferase-(CT816)


CPn0969 


1111812 


1112999 


F 


tyrP_1-Tyrosine Transport_1-(CT817)


CPn0970


1113461 


1114648 


F 


tyrP_2-Tyrosine Transport_2-(CT818)


CPn0971 


1114702 


1115415


F 


yccA-Transport Permease-(CT819)


CPn0972 


1116299 


1115430 


R 


ftsY-Cell Division Protein FtsY-(CT820)


CPn0973 


1116370 


1117527 


F 


sucC-Succinyl-CoA Synthetase, Beta-(CT821)


CPn0974 


1117544 


1118422 


F 


sucD-Succinyl-CoA Synthetase, Alpha-(CT822)


CPn0975


1119104 


1119637 


F 




CPn0976 


1120082 


1121185 


F 




CPn0977 


1121371 


1122402 


F 




CPn0978 


1122665 


1123693 


F 




CPn0979 


1123980 


1125443 


F 


htrA-DO Serine Protease-(CT823)


CPn0980 


1126982 


1125504 


R 


*similarity to Saccharomyces cerevisiae hypothetical 52.9KD protein


CPn0981 


1127031 


1129952 


F 


Zinc Metalloprotease (insulinase family)-(CT824)


CPn0982 


1131194 


1129962 


R 


yigN family-(CT825)


CPn0983 


1132000 


1131206 


R 


pssA-Glycerol-Serine Phosphatidyltransferase-(CT826)


CPn0984 


1132379 


1135510 


F 


nrdA-Ribonucleoside Reductase, Large Chain-(CT827)


CPn0985 


1135534 


1136571 


F 


nrdB-Ribonucleoside Reductase, Small Chain-(CT828)


CPn0986 


1136724 


1137395 


F 


yggH-predicted rRNA Methylase-(CT829)


CPn0987 


1137516 


1138115 


F 


ytgB-like predicted rRNA methylase-(CT830)


CPn0988 


1138986 


1138075 


R 


murB-UDP-N-Acetylenolpyruvoylglucosamine Reductase-(CT831)


CPn0989


1139495 


1139016 


R 


CT832 hypothetical protein 


CPn0990 


1139883 


1140440 


F 


infC-Initiation Factor 3-(CT833)


CPn0991 


1140421 


1140612 


F 


rl35-L35 Ribosomal Protein-(CT834)
CPn0992


1140634 


1140996 


F 


CPn0993


1141014 


1142030 


F 


CPn0994 


1142398 


1144440 


F 


CPn0995


1145512 


1144415 


R 


CPn0996 


1146589 


1145519 


R 


CPn0997 


1146708 


1147664 


F 


CPn0998 


1147855 


1150584 


F 


CPn0999 


1152847 


1150766 


R 


CPn1000


1153157 


1152891 


R 


CPnlOOl 


1153405 


1153869 


F 


CPnlO02 


1153862 


1154089 


F 


CPnl003 


1154796 


1154092 


R 


CPnl004 


1155397 


1154879 


R 


CEnlOOS 


1155933 


1155415 


R 


CPnl006 


1156472 


1155990 


R 


CPnl007 


1156689 


1156907 


F 


CPnl008 


1156928 


1158223 


F 


CPnl009 


1159058 


1158186 


R 


CPnlOlO 


1159672 


1159067 


R 


CPnlOll 


1160306 


1159902 


R 


CPnl012 


1162193 


1160421 


R 


CPn1013 1162245 1163624 F
CPn1014 1165426 1163732 R
CPn1015 1165634 1166893 F
CPn1016 1167042 1168898 F
CPn1017 1169006 1169935 F
CPn1018 1169898 1170629 F
CPn1019 1172128 1170638 R
CPn1020 1173679 1172150 R
CPn1021 1174213 1173698 R
CPn1022 1175673 1174216 R
CPn1023 1176035 1176331 F
CPn1024 1177236 1176334 R
CPn1025 1177302 1178879 F
CPn1026 1178997 1179137 F
CPn1027 1179175 1180755 F
CPn1028 1181016 1181999 F
CPn1029 1182008 1182844 F
CPn1030 1183886 1182843 R
CPn1031 1185552 1184098 R
CPn1032 1186150 1185566 R
CPn1033 1187500 1186187 R


CPn1034 1188517 1187732 R
CPn1035 1190000 1188570 R
CPn1036 1191135 1189984 R
CPn1037 1192199 1191123 R
CPn1038 1192726 1192199 R
CPn1039 1193999 1192665 R
CPn1040 1194741 1194073 R
CPn1041 1195994 1194726 R
CPn1042 1196590 1195934 R
CPn1043 1197717 1196572 R
CPn1044 1198691 1197699 R
CPn1045 1199590 1198901 R
CPn1046 1200675 1199590 R
CPn1047 1200552 1201343 F
CPn1048 1201606 1202604 F
CPn1049 1202595 1203914 F
CPn1050 1203926 1204798 F
CPn1051 1204962 1205270 F
CPn1052 1205417 1206169 F
CPn1053 1206153 1206701 F
CPn1054 1207034 1209466 F
CPn1055 1209694 1210521 F
CPn1056 1210527 1211228 F
CPn1057 1211497 1213596 F
CPn1058 1213748 1214836 F
CPn1059 1214848 1215678 F
CPn1060 1217658 1215727 R
CPn1061 1217920 1217666 R
CPn1062 1219820 1218159 R
CPn1063 1219951 1220712 F



rl20-L20 Ribosomal Protein-(CT835)

pheS-Phenylalanyl tRNA Synthetase, Alpha-(CT836)

CT837 hypothetical protein

CT838 hypothetical protein

CT839 hypothetical protein

mesJ-PP-loop superfamily ATPase-(CT840)

ftsH-ATP-dependent zinc protease-(CT841)

pnp-Polyribonucleotide Nucleotidyltransferase-(CT842)

rs15-S15 Ribosomal Protein-(CT843)

yfhC-cytosine deaminase-(CT844)

CT845 hypothetical protein

CT846 hypothetical protein

CT847 hypothetical protein

CT848 hypothetical protein

CT849 hypothetical protein

CT849.1 hypothetical protein

CT850 hypothetical protein

map-Methionine Aminopeptidase-(CT851)

CT852 hypothetical protein

CT853 hypothetical protein

yzeB-ABC transporter permease-(CT854)

fumC-Fumarate Hydratase-(CT855)

ychM-Sulfate Transporter-(CT856)

CT857 hypothetical protein (possible IM protein)

CT858 hypothetical protein

lytB-Metalloprotease-(CT859)

CT860 hypothetical protein
CT861 hypothetical protein
lcrH_2-Low Calcium Response_2-(CT862)
CT863 hypothetical protein

xerD-Integrase/recombinase-(CT864)
pgi-Glucose-6-P Isomerase-(CT378)
ltuA-(CT377)

mdhC-Malate Dehydrogenase-(CT376)

predicted D-amino acid dehydrogenase-(CT375)
arcD-Arginine/Ornithine Antiporter-(CT374)
CT373 hypothetical protein
CT372 hypothetical protein

Predicted OMP_1 (CT371) [leader (18) peptide]
AroE-Shikimate 5-Dehydrogenase-(CT370)
AroB-Dehydroquinate Synthase-(CT369)
AroC-Chorismate Synthase-(CT368)
aroL-Shikimate Kinase II-(CT367)
aroA-Phosphoshikimate Vinyltransferase-(CT366)

*bioA-Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
*bioD-dethiobiotin synthetase
bioF_2-Oxononanoate Synthase_2
*bioB-Biotin Synthase

*conserved hypothetical bacterial membrane protein
*Tryptophan Hydroxylase

dapB-Dihydrodipicolinate Reductase-(CT364)
asd-Aspartate Dehydrogenase-(CT363)
lysC-Aspartokinase III-(CT362)
dapA-Dihydrodipicolinate Synthase-(CT361)



CT356 hypothetical protein
CT355 hypothetical protein
kgsA-Dimethyladenosine Transferase-(CT354)
dxs/tkt-Transketolase-(CT331)
CT330 hypothetical protein
xseA-Exodeoxyribonuclease VII-(CT329)
tpiS-Triosephosphate Isomerase-(CT328)

CPn1064 1220719 1220895 F
CPn1065 1221095 1220928 R
CPn1066 1221135 1221488 F
CPn1067 1221735 1222292 F
CPn1068 1223258 1222365 R
CPn1069 1223513 1223941 F
CPn1070 1225511 1224144 R
CPn1071 1227324 1225885 R
CPn1072 1227969 1228835 F
CPn1073 1229011 1229832 F



def-Polypeptide Deformylase-(CT353)
rnhB_2-Ribonuclease HII_2-(CT008)
yfgA-HTH Transcriptional Regulator-(CT009)



Predicted OMP_2-(CT371)
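
The coordinate columns of the preceding table can be read mechanically. The following is a minimal sketch, not part of the original disclosure, under the assumption that the two numeric columns are inclusive genome positions and that entries marked R list the higher coordinate first (as seen for CPn0972). The function names `orf_length` and `inferred_strand` and the tuple layout are illustrative only.

```python
# Sketch: derive coding-sequence length from Table 1 style rows
# (ID, first coordinate, second coordinate, strand).
# Assumption: coordinates are inclusive; "R" rows list the higher
# coordinate first, as the table shows for CPn0972.

def orf_length(start: int, end: int) -> int:
    """Return the inclusive length in base pairs between two coordinates."""
    return abs(end - start) + 1


def inferred_strand(start: int, end: int) -> str:
    """Infer F or R from coordinate order (hypothetical helper)."""
    return "F" if start <= end else "R"


rows = [
    ("CPn0971", 1114702, 1115415, "F"),
    ("CPn0972", 1116299, 1115430, "R"),
    ("CPn1064", 1220719, 1220895, "F"),
]

for gene_id, start, end, strand in rows:
    length = orf_length(start, end)
    # A length divisible by 3 is consistent with a whole number of codons.
    print(gene_id, strand, length, "bp", "whole codons:", length % 3 == 0)
```

Under these assumptions the three sample rows give lengths that are multiples of three, which is consistent with the inclusive-coordinate reading but does not prove it.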
Table 2 (Supplemental Data). Functional Assignments of C. pneumoniae Coding Sequences. C. trachomatis genes are shown in parentheses.



Amino Acid Biosynthesis 



Aromatic Family 






1039 (CT366) aroA Phosphoshikimate Vinyltransferase
1036 (CT369) aroB Dehydroquinate Synthase
1037 (CT368) aroC Chorismate Synthase
1035 (CT370) aroE Shikimate 5-Dehydrogenase
0484 (CT382) aroG Deoxyheptonate Aldolase
1038 (CT367) aroL Shikimate Kinase II
0740 (CT637) tyrB Aromatic AA Aminotransferase


Aspartate Family (lysine)
1048 (CT363) asd Aspartate Dehydrogenase
1050 (CT361) dapA Dihydrodipicolinate Synthase
1047 (CT364) dapB Dihydrodipicolinate Reductase
0519 (CT430) dapF Diaminopimelate Epimerase
1049 (CT362) lysC Aspartokinase III



Serine Family
0433 (CT282) gcsH Glycine Cleavage System H Protein
0521 (CT432) glyA Serine Hydroxymethyltransferase



Base & Nucleotide Metabolism 



0171 guaA GMP Synthase
0172 guaB Inosine 5'-Monophosphate Dehydrogenase
0608 Uridine 5'-Monophosphate Synthase
0735 Uridine Kinase
0244 (CT123) adk Adenylate Kinase
0894 (CT751) amn AMP Nucleosidase
0568 (CT452) cmk CMP Kinase
0392 (CT039) dcd dCTP Deaminase
0059 (CT292) dut dUTP Nucleotidohydrolase
0120 (CT030) gmk GMP Kinase
0619 (CT500) ndk Nucleoside-2-P Kinase
0984 (CT827) nrdA Ribonucleoside Reductase, Large Chain
0985 (CT828) nrdB Ribonucleoside Reductase, Small Chain
0236 (CT183) pyrG CTP Synthetase
0698 (CT678) pyrH UMP Kinase
0273 (CT188) tdk Thymidylate Kinase
0659 (CT539) trxA Thioredoxin
0314 (CT099) trxB Thioredoxin Reductase
1001 (CT844) yfhC Cytosine Deaminase

Biosynthesis of Cofactors



Biotin, Lipoate & Ubiquinone
1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
1044 bioB Biotin Synthase
1042 bioD Dethiobiotin Synthetase
0923 (CT777) bioF_1 Oxononanoate Synthase_1
1043 (CT777) bioF_2 Oxononanoate Synthase_2
0866 (CT725) birA Biotin Synthetase
0748 (CT628) ispA Geranyl Transtransferase
0832 (CT558) lipA Lipoate Synthetase
0265 (CT219) ubiA Benzoate Octaphenyltransferase
0264 (CT220) ubiD Phenylacrylate Decarboxylase
0515 (CT428) ubiE Ubiquinone Methyltransferase


Folic Acid
0759 (CT612) folA Dihydrofolate Reductase
0335 (CT078) folD Methylene Tetrahydrofolate Dehydrogenase
0758 (CT613) folP Dihydropteroate Synthase
0757 (CT614) folX Dihydroneopterin Aldolase
0763 (CT649) ygfA Formyltetrahydrofolate Cycloligase


Porphyrin
0714 (CT662) hemA Glutamyl tRNA Reductase
0744 (CT633) hemB Porphobilinogen Synthase
0052 (CT299) hemC Porphobilinogen Deaminase
0890 (CT747) hemE Uroporphyrinogen Decarboxylase
0888 (CT745) hemG Protoporphyrinogen Oxidase
0138 (CT210) hemL Glutamate-1-Semialdehyde-2,1-Aminomutase
0380 (CT052) hemN_1 Coproporphyrinogen III Oxidase_1
0889 (CT746) hemN_2 Coproporphyrinogen III Oxidase_2
0603 (CT485) hemZ Ferrochelatase


Riboflavin
0872 (CT731) ribA&ribB GTP Cyclohydratase & DHBP Synthase
0532 (CT405) ribC Riboflavin Synthase
0871 (CT730) ribD Riboflavin Deaminase
0873 (CT732) ribE Ribityllumazine Synthase
0320 (CT093) ribF FAD Synthase



Cell Envelope 

Fatty Acid & Phospholipid Metabolism
0161 (CT206) (predicted acyltransferase family)
0922 (CT776) aas Acylglycerophosphoethanolamine Acyltransferase
0414 (CT265) accA AcCoA Carboxylase/Transferase Alpha
0183 (CT123) accB Biotin Carboxyl Carrier Protein
0182 (CT124) accC Biotin Carboxylase
0058 (CT293) accD AcCoA Carboxylase/Transferase Beta
0295 (CT236) acpP Acyl Carrier Protein
0313 (CT100) acpS Acyl-carrier Protein Synthase
0567 (CT451) cdsA Phosphatidate Cytidylyltransferase
0297 (CT238) fabD Malonyl Acyl Carrier Transacylase
0916 (CT770) fabF Acyl Carrier Protein Synthase
0296 (CT237) fabG Oxoacyl (Carrier Protein) Reductase
0298 (CT239) fabH Oxoacyl Carrier Protein Synthase III
0406 (CT104) fabI Enoyl-Acyl-Carrier Protein Reductase
0651 (CT532) fabZ Myristoyl-Acyl Carrier Dehydratase
0098 (CT010) htrB Acyltransferase
0271 (CT136) Lysophospholipase Esterase
0615 (CT496) pgsA_1 Glycerol-3-P Phosphatidyltransferase_1
0947 (CT797) pgsA_2 Glycerol-3-P Phosphatidyltransferase_2
0958 (CT807) plsB Glycerol-3-P Acyltransferase
0569 (CT453) plsC Glycerol-3-P Acyltransferase
0962 (CT811) plsX FA/Phospholipid Synthesis Protein
0839 (CT699) psdD Phosphatidylserine Decarboxylase
0983 (CT826) pssA Glycerol-Serine Phosphatidyltransferase
0921 (CT775) snGlycerol-3-P Acyltransferase
0654 (CT535) yciA Acyl-CoA Thioesterase
0877 (CT736) ybcL CT736 Hypothetical Protein



LPS
0154 (CT208) gseA KDO Transferase
0721 (CT655) kdsA KDO Synthetase
0235 (CT182) kdsB Deoxyoctulonosic Acid Synthetase
0650 (CT531) lpxA Acyl-Carrier UDP-GlcNAc O-Acyltransferase
0965 (CT411) lpxB Lipid A Disaccharide Synthase
0652 (CT533) lpxC Myristoyl GlcNAc Deacetylase
0302 (CT243) lpxD UDP Glucosamine N-Acyltransferase


Membrane Proteins, Lipoproteins & Porins
0310 (CT251) 60IM 60kDa Inner Membrane Protein
0556 (CT442) crpA 15kDa Cysteine-Rich Protein
0653 (CT534) cutE Apolipoprotein N-Acetyltransferase
0311 (CT252) lgt Prolipoprotein Diacylglycerol Transferase
0558 (CT444) omcA 9kDa-Cysteine-Rich Lipoprotein
0557 (CT443) omcB 60kDa Cysteine-Rich OMP
0695 (CT681) ompA Major Outer Membrane Protein
0854 (CT713) ompB Outer Membrane Protein B
0781 (CT600) pal Peptidoglycan-Associated Lipoprotein
0300 (CT241) yaeT Omp85 Homolog


Peptidoglycan
0417 (CT268) amiA N-Acetylmuramoyl Alanine Amidase
0730 (CT601) amiB N-Acetylmuramoyl-L-Ala Amidase
0672 (CT551) dacF D-Ala-D-Ala Carboxypeptidase
0968 (CT816) glmS Glucosamine-Fructose-6-P Aminotransferase
0749 (CT629) glmU UDP-GlcNAc Pyrophosphorylase
0900 (CT757) mraY Muramoyl-Pentapeptide Transferase
0571 (CT455) murA UDP-N-Acetylglucosamine Transferase
0988 (CT831) murB UDP-N-Acetylenolpyruvoylglucosamine Reductase
0905 (CT762) murC&ddlA Muramate-Ala Ligase & D-Ala-D-Ala Ligase
0901 (CT758) murD Muramoylalanine-Glutamate Ligase
0418 (CT269) murE N-Acetylmuramoylalanylglutamyl DAP Ligase
0899 (CT756) murF Muramoyl-DAP Ligase
0904 (CT761) murG Peptidoglycan Transferase
0902 (CT759) nlpD Muramidase (invasin repeat family)
0694 (CT682) pbp2 PBP2-Transglycolase/Transpeptidase
0419 (CT270) pbp3 Transglycolase/Transpeptidase
0421 (CT272) yabC PBP2B Family Methyltransferase



Cellular Processes

Cell Division
0959 (CT808) cafE Axial Filament Protein
0880 (CT739) ftsK Cell Division Protein FtsK
0903 (CT760) ftsW Cell Division Protein FtsW
0972 (CT820) ftsY Cell Division Protein FtsY
0617 (CT498) gidA FAD-dependent Oxidoreductase
0805 (CT582) minD Chromosome Partitioning ATPase
0850 (CT709) mreB Rod Shape Protein-Sugar Kinase
0867 (CT726) rodA Rod Shape Protein
0684 (CT688) parB Chromosome Partitioning Protein

Detoxification
0057 (CT294) sodM Superoxide Dismutase (Mn)
0778 (CT603) ahpC Thio-specific Antioxidant (TSA) Peroxidase

Signal Transduction
0148 (CT145) S/T Protein Kinase
0584 (CT467) atoS Two-Component Sensor
0294 (CT235) cAMP-Dependent Protein Kinase Regulatory Subunit
0712 (CT664) (FHA domain)
0478 (CT379) hflX GTP Binding Protein
0703 (CT673) S/T Protein Kinase
0095 (CT301) S/T Protein Kinase
0397 (CT259) PP2C Phosphatase Family
0037 (CT337) ptsH PTS Phosphocarrier Protein Hpr
0038 (CT336) ptsI PTS PEP Phosphotransferase
0060 (CT291) ptsN_1 PTS IIA Protein_1
0061 (CT290) ptsN_2 PTS IIA Protein + HTH DNA-Binding Domain
0262 (CT218) surE SurE-like Acid Phosphatase
0838 (CT698) thdF Thiophene/Furan Oxidation Protein
0693 (CT683) TPR Repeats-CT683 Hypothetical Protein
0321 (CT092) ychF GTP Binding Protein
0544 (CT418) yhbZ GTP binding protein
0844 (CT703) yphC GTPase/GTP-binding protein


Standard Protein Secretion
0115 (CT025) ffh Signal Recognition Particle GTPase
0363 (CT060) flhA Flagellar Secretion Protein
0858 (CT717) fliI Flagellum-specific ATP Synthase
0704 (CT672) fliN Flagellar Motor Switch Domain/YscQ family
0815 (CT572) gspD Gen. Secretion Protein D
0816 (CT571) gspE Gen. Secretion Protein E
0817 (CT570) gspF Gen. Secretion Protein F
0359 (CT064) lepA GTPase
0110 (CT020) lepB Signal Peptidase I
0535 (CT408) lspA Lipoprotein Signal Peptidase
0260 (CT141) secA_1 Protein Translocase Subunit_1
0841 (CT701) secA_2 Translocase SecA_2
0564 (CT448) secD&secF Protein Export Proteins SecD/SecF (fusion)
0075 (CT321) secE Preprotein Translocase
0629 (CT510) secY Translocase
0848 (CT707) Trigger Factor-Peptidyl-prolyl Isomerase


Transport-Related Proteins
0486 Hypothetical Proline Permease
0289 (CT230) aaaT Neutral Amino Acid (Glutamate) Transporter
0691 (CT685) abcX ABC Transporter ATPase
1031 (CT374) arcD Arginine/Ornithine Antiporter
0482 (CT381) artJ Arginine Periplasmic Binding Protein
0836 (CT554) brnQ Amino Acid (Branched) Transport
0536 (CT409) dagA_1 D-Ala/Gly Permease_1
0876 (CT735) dagA_2 D-Alanine/Glycine Permease_2
0682 (CT690) dppD ABC ATPase Dipeptide Transport
0683 (CT689) dppF ABC ATPase Dipeptide Transport
0280 (CT689) dppF Dipeptide Transporter ATPase
0785 (CT596) exbB Macromolecule Transporter
0784 (CT597) exbD Biopolymer Transport Protein
0604 (CT486) fliY Glutamine Binding Protein
0192 (CT129) glnP ABC Amino Acid Transporter Permease
0191 (CT130) glnQ ABC Amino Acid Transporter ATPase
0528 (CT401) gltT Glutamate Symport
0286 (CT194) mgtE Mg++ Transporter (CBS Domain)
0413 (CT264) msbA Transport ATP Binding Protein
0290 (CT231) Na+-dependent Transporter
0195 (CT198) oppA_1 Oligopeptide Binding Protein_1
0196 (CT198) oppA_2 Oligopeptide Binding Protein_2
0197 (CT139) oppA_3 Oligopeptide Binding Protein_3


0198 (CT175) oppA_4 Oligopeptide Binding Protein_4
0599 (CT480) oppA_5 Oligopeptide Binding Lipoprotein_5
0199 (CT199) oppB_1 Oligopeptide Permease_1
0598 (CT479) oppB_2 Oligopeptide Permease_2
0200 (CT200) oppC_1 Oligopeptide Permease_1
0597 (CT478) oppC_2 Oligopeptide Permease_2
0201 (CT201) oppD Oligopeptide Transport ATPase
0202 (CT202) oppF Oligopeptide Transport ATPase
0231 (CT180) tauB ABC Transport ATPase (Nitrate/Fe)
0782 (CT599) tolB Macromolecule Transporter
0969 (CT817) tyrP_1 Tyrosine Transport_1
0970 (CT818) tyrP_2 Tyrosine Transport_2
0665 (CT544) uhpC Hexosephosphate Transport
0282 (CT216) xasA Amino Acid Transporter
0207 (CT204) ybhI dicarboxylate Translocator
0971 (CT819) yccA Transport Permease
0248 (CT152) ycfV ABC Transporter ATPase
1014 (CT856) ychM Sulfate Transporter
0736 (CT641) ygeD Efflux Protein
0680 (CT692) ygo4 Phosphate Permease
0723 (CT653) yhbG ABC Transporter ATPase
0023 (CT348) yjjK ABC Transporter Protein ATPase
0127 (CT034) ytfF Cationic Amino Acid Transporter
0349 (CT067) ytgA Solute Protein Binding Family
0348 (CT068) ytgB ABC Transporter ATPase
0347 (CT069) ytgC Integral Membrane Protein
0346 (CT070) ytgD Integral Membrane Protein
1012 (CT854) yzeB ABC Transporter Permease
0868 (CT727) zntA Metal Transport P-type ATPase
0279 Possible ABC Transporter Permease Protein
0543 (CT417) (Metal Transport Protein)
0692 (CT684) ABC Transporter
0542 (CT416) ABC Transporter ATPase
0690 (CT686) ABC Transporter Membrane Protein
0541 (CT415) solute binding protein

Type-III Secretion
0323 (CT090) lcrD Low Calcium Response D
0324 (CT089) lcrE Low Calcium Response E
0811 (CT576) lcrH_1 Low Ca Response Protein H_1
1021 (CT862) lcrH_2 Low Calcium Response_2
0325 (CT088) sycE Secretion Chaperone
0702 (CT674) yscC Yop C/Gen Secretion Protein D
0828 (CT559) yscJ Yop Translocation J
0826 (CT561) yscL Yop Translocation L
0707 (CT669) yscN Yop N (Flagellar-Type ATPase)
0825 (CT562) yscR Yop Translocation R
0824 (CT563) yscS YopS Translocation Protein
0823 (CT564) yscT YopT Translocation T
0322 (CT091) yscU Yop Translocation Protein U

Central Intermediary Metabolism

Glycogen Metabolism
0856 (CT715) UDP-Glucose Pyrophosphorylase
0948 (CT798) glgA Glycogen Synthase
0475 (CT866) Glucan Branching Enzyme
0607 (CT489) glgC Glucose-1-P Adenylyltransferase
0307 (CT248) Glycogen Phosphorylase
0388 (CT042) Glycogen Hydrolase (debranching)
0326 (CT087) malQ Glucanotransferase
0851 (CT710) pckA Phosphoenolpyruvate Carboxykinase

Phosphorous & Sulfur
0543 (CT435) cysJ Sulfite Reductase
0920 (CT774) cysQ Sulfite Synthesis/Biphosphate Phosphatase
0025 (CT346) atsA Sulphohydrolase
0918 (CT772) ppa Inorganic Pyrophosphatase



DNA Replication, Modification, Repair & Recombination

DNA Mismatch Repair
0505 3-Methyladenine DNA Glycosylase
0812 (CT575) mutL DNA Mismatch Repair
0941 (CT792) mutS DNA Mismatch Repair
0402 (CT107) mutY Adenine Glycosylase
0732 (CT625) nfo Endonuclease IV
0837 (CT697) nth Endonuclease III

DNA Modification
0596 (CT477) ada Methyltransferase
0114 (CT024) hemK A/G-specific Methylase
0891 (CT748) mfd Transcription-Repair Coupling
0620 (CT501) ruvA Holliday Junction Helicase
0390 (CT040) ruvB Holliday Junction Helicase
0621 (CT502) ruvC Crossover Junction Endonuclease
0053 (CT298) sms Sms Protein
0773 (CT607) ung Uracil DNA Glycosylase
1062 (CT329) xseA Exodeoxyribonuclease VII

DNA Recombination
0762 (CT650) recA RecA Recombination Protein
0738 (CT639) recB Exodeoxyribonuclease V, Beta
0737 (CT640) recC Exodeoxyribonuclease V, Gamma
0123 (CT033) recD_1 Exodeoxyribonuclease V (Alpha Subunit)_1
0752 (CT652) recD_2 Exodeoxyribonuclease V, Alpha_2
0339 (CT074) recF ABC Superfamily ATPase
0340 (CT074) (frame-shift with 0339)
0563 (CT447) recJ ssDNA Exonuclease
0299 (CT240) recR Recombination Protein

DNA Replication
0309 (CT250) dnaA_1 Replication Initiation Protein_1
0424 (CT275) dnaA_2 Replication Initiation Factor_2
0616 (CT497) dnaB Replicative DNA Helicase
0666 (CT545) dnaE DNA Pol III Alpha
0942 (CT794) dnaG DNA Primase
0338 (CT075) dnaN DNA Pol III (Beta)
0410 (CT261) dnaQ_1 DNA Pol III Epsilon Chain_1
0655 (CT536) dnaQ_2 DNA Pol III Epsilon Chain_2
0040 (CT334) dnaX_1 DNA Pol III Gamma and Tau_1
0272 (CT187) dnaX_2 DNA Pol III Gamma and Tau_2
0149 (CT146) dnlJ DNA Ligase
0274 (CT189) gyrA_1 DNA Gyrase Subunit A_1
0716 (CT660) gyrA_2 DNA Gyrase Subunit A_2
0275 (CT190) gyrB_1 DNA Gyrase Subunit B_1
0715 (CT661) gyrB_2 DNA Gyrase Subunit B_2
0416 (CT267) himD Integration Host Factor Alpha
0612 (CT493) polA DNA Polymerase I
0924 (CT778) priA Primosomal Protein N'
0386 (CT044) ssb SS DNA Binding Protein
0835 (CT555) SWI/SNF family helicase_1
0849 (CT708) SWI/SNF family helicase_2
0769 (CT643) topA DNA Topoisomerase I-Fused to SWI Domain
0024 (CT347) xerC Integrase/recombinase
1024 (CT864) xerD Integrase/recombinase

Eukaryotic-Type Chromatin Factors
0886 (CT743) hctA Histone-like Developmental Protein
0384 (CT046) hctB Histone-like Protein 2
0878 (CT737) SET Domain protein
0577 (CT460) SWIB (YM74) Complex Protein

UVR Excinuclease Repair System
0096 (CT333) uvrA Excinuclease ABC Subunit A
0801 (CT586) uvrB Excinuclease ABC Subunit B
0940 (CT791) uvrC Excinuclease ABC, Subunit C
0772 (CT608) uvrD DNA Helicase

Energy Metabolism

Aerobic
0855 (CT714) gpdA Glycerol-3-P Dehydrogenase
0743 (CT634) nqrA Ubiquinone Oxidoreductase, Alpha
0427 (CT278) nqr2 NADH (Ubiquinone) Dehydrogenase
0428 (CT279) nqr3 NADH (Ubiquinone) Oxidoreductase, Gamma
0429 (CT280) nqr4 NADH (Ubiquinone) Reductase 4
0430 (CT281) nqr5 NADH (Ubiquinone) Reductase 5
0883 (CT740) nqr6 Phenolhydrolase/NADH (Ubiquinone) Oxidoreductase 6

ATP Biogenesis and metabolism



0351 (CT065) adt_1 ADP/ATP Translocase_1
0614 (CT495) adt_2 ADP/ATP Translocase_2
0088 (CT308) atpA ATP Synthase Subunit A
0089 (CT307) atpB ATP Synthase Subunit B
0090 (CT306) atpD ATP Synthase Subunit D
0086 (CT310) atpE ATP Synthase Subunit E
0091 (CT305) atpI ATP Synthase Subunit I
0092 (CT304) atpK ATP Synthase Subunit K
0860 (CT719) fliF Flagellar M-Ring Protein


Electron Transport Chain
0102 (CT013) cydA Cytochrome Oxidase Subunit I
0103 (CT014) cydB Cytochrome Oxidase Subunit II
0364 (CT059) Ferredoxin
0084 (CT312) Predicted Ferredoxin


Glycolysis & Gluconeogenesis
0281 (CT215) dhnA Predicted 1,6-Fructose Biphosphate Aldolase
0800 (CT587) eno Enolase
0624 (CT505) gapA Glyceraldehyde-3-P Dehydrogenase
0056 (CT295) mrsA Phosphomannomutase
0967 (CT815) pgm Phosphoglucomutase
0160 (CT207) pfkA_1 Fructose-6-P Phosphotransferase_1
0208 (CT205) pfkA_2 Fructose-6-P Phosphotransferase_2
1025 (CT378) pgi Glucose-6-P Isomerase
0679 (CT693) pgk Phosphoglycerate Kinase
0863 (CT722) pgmA Phosphoglycerate Mutase
0097 (CT332) pyk Pyruvate Kinase
1063 (CT328) tpiS Triosephosphate Isomerase


Pentose Phosphate Pathway
0239 (CT186) devB Glucose-6-P Dehydrogenase (DevB family)
1060 (CT331) dxs Transketolase
0360 (CT063) gnd 6-Phosphogluconate Dehydrogenase
0185 (CT121) rpe Ribulose-P Epimerase
0141 (CT213) rpiA Ribose-5-P Isomerase A
0083 (CT313) tal Transaldolase
0893 (CT750) tktB Transketolase
0238 (CT185) zwf Glucose-6-P Dehydrogenase


Pyruvate Dehydrogenase
0833 (CT557) lpdA Lipoamide Dehydrogenase
0436 (CT285) lplA_1 Lipoate Protein Ligase-Like Protein
0618 (CT499) lplA_2 Lipoate-Protein Ligase A
0033 (CT340) pdhA&B Oxoisovalerate Dehydrogenase α/β Fusion
0304 (CT245) pdhA Pyruvate Dehydrogenase Alpha
0305 (CT246) pdhB Pyruvate Dehydrogenase Beta
0306 (CT247) pdhC Dihydrolipoamide Acetyltransferase


TCA Cycle
0495 (CT390) aspC Aspartate Aminotransferase
1013 (CT855) fumC Fumarate Hydratase
1028 (CT376) mdhC Malate Dehydrogenase
0789 (CT592) sdhA Succinate Dehydrogenase
0790 (CT591) sdhB Succinate Dehydrogenase
0788 (CT593) sdhC Succinate Dehydrogenase
0378 (CT054) sucA Oxoglutarate Dehydrogenase
0377 (CT055) sucB_1 Dihydrolipoamide Succinyltransferase_1
0527 (CT400) sucB_2 Dihydrolipoamide Succinyltransferase_2
0973 (CT821) sucC Succinyl-CoA Synthetase, Beta
0974 (CT822) sucD Succinyl-CoA Synthetase, Alpha



Protein Folding, Assembly & Modification

Chaperones
0949 (CT799) ctc General Stress Protein
0534 (CT407) dksA DnaK Suppressor
0032 (CT341) dnaJ Heat Shock Protein J
0503 (CT396) dnaK Hsp-70
0134 (CT110) groEL_1 Hsp-60_1
0777 (CT604) groEL_2 Hsp-60_2
0898 (CT755) groEL_3 Hsp-60_3
0135 (CT111) groES 10KDa Chaperonin
0502 (CT395) grpE HSP-70 Cofactor
0661 (CT541) mip FKBP-type Peptidyl-prolyl Cis-Trans Isomerase



Proteases
0144 (CT113) clpB Clp Protease ATPase
0437 (CT286) clpC ClpC Protease
0520 (CT431) clpP_1 CLP Protease
0847 (CT706) clpP_2 CLP Protease Subunit
0846 (CT705) clpX CLP Protease ATPase
0269 (CT138) Dipeptidase
0998 (CT841) ftsH ATP-dependent Zinc Protease
0030 (CT343) gcp_1 O-Sialoglycoprotein Endopeptidase_1
0194 (CT197) gcp_2 O-Sialoglycoprotein Endopeptidase_2
0979 (CT823) htrA DO Serine Protease
0957 (CT806) ide Insulinase family/Protease III
0027 (CT344) lon Lon ATP-dependent Protease
1017 (CT859) lytB Metalloprotease
1009 (CT851) map Methionine Aminopeptidase
0385 (CT045) pepA Leucyl Aminopeptidase A
0136 (CT112) pepF Oligopeptidase
0813 (CT574) pepP Aminopeptidase P
0613 (CT494) sohB Protease
0555 (CT441) tsp Tail-Specific Protease
0344 (CT072) yaeL Metalloprotease
0981 (CT824) Zinc Metalloprotease (insulinase family)


Protein Isomerases
0227 (CT176) dsbB Disulfide bond Oxidoreductase
0786 (CT595) dsbD Thio disulfide Interchange Protein
0228 (CT177) dsbG Disulfide Bond Chaperone
0933 (CT783) Predicted Disulfide Bond Isomerase
0926 (CT780) Thioredoxin Disulfide Isomerase



Transcription 



RNA Degradation
0999 (CT842) pnp Polyribonucleotide Nucleotidyltransferase
0054 (CT297) rnc Ribonuclease III
0119 (CT029) rnhB_1 Ribonuclease HII_1
1068 (CT008) rnhB_2 Ribonuclease HII_2
0934 (CT784) rnpA Ribonuclease P Protein Component
0504 (CT397) vacB Ribonuclease Family

RNA Elongation & Termination Factors
0741 (CT636) greA Transcription Elongation Factor
0316 (CT097) nusA N Utilization Protein A
0076 (CT320) nusG Transcriptional Antitermination
0845 (CT704) pcnB_1 Poly A Polymerase_1
0966 (CT410) pcnB_2 PolyA Polymerase_2
0610 (CT491) rho Transcription Termination Factor

RNA Methylases
0674 (CT553) fmu RNA Methyltransferase
1059 (CT354) kgsA Dimethyladenosine Transferase
0187 (CT133) Predicted Methylase
0530 (CT403) spoU_1 rRNA Methylase_1
0660 (CT540) spoU_2 rRNA Methylase_2
0117 (CT027) trmD tRNA (Guanine N-1)-Methyltransferase
0885 (CT742) ygcA rRNA Methyltransferase
0986 (CT829) yggH Predicted rRNA Methylase
0987 (CT830) ytgB Predicted rRNA Methylase

RNA Modification



0649 (CT530) fmt Methionyl tRNA Formyltransferase
0910 (CT766) miaA tRNA Pyrophosphate Transferase
0719 (CT658) sfhB Predicted Pseudouridine Synthase
0219 (CT193) Queuine tRNA Ribosyl Transferase
0580 (CT463) truA Pseudouridylate Synthase I
0319 (CT094) truB tRNA Pseudouridine Synthase
0403 (CT106) yceC Predicted Pseudouridine Synthetase Family
0864 (CT723) yjbC Predicted Pseudouridine Synthase



RNA Polymerase & Transcription Regulators
0586 (CT468) atoC Two-Component Regulator
0362 (CT061) rpsD Sigma-28/WhiG Family
0501 (CT394) hrcA HTH Transcriptional Repressor
0793 (CT588) rbsU Sigma Regulatory Family Protein-PP2C Phosphatase (RsbW Antagonist)
0626 (CT507) rpoA RNA Polymerase Alpha
0081 (CT315) rpoB RNA Polymerase Beta
0082 (CT314) rpoC RNA Polymerase Beta'
0756 (CT615) rpoD RNA Polymerase Sigma-66
0771 (CT609) rpoN RNA Polymerase Sigma-54
0511 (CT424) rsbV_1 Sigma Regulatory Factor_1
0909 (CT765) rsbV_2 Sigma Factor Regulator_2
0670 (CT549) rsbW Sigma Regulatory Factor-Histidine Kinase
0750 (CT630) tctD HTH Transcriptional Regulatory Protein + Receiver Domain
1069 (CT009) yfgA HTH Transcriptional Regulator








Translation 


Amino Acyl tRNA Synthesis
0892 (CT749) alaS Alanyl tRNA Synthetase
0570 (CT454) argS Arginyl tRNA Transferase
0662 (CT542) aspS Aspartyl tRNA Synthetase
0932 (CT782) cysS Cysteinyl tRNA Synthetase
0003 (CT003) gatA Glu tRNA Gln Amidotransferase (A subunit)
0004 (CT004) gatB Glu tRNA Gln Amidotransferase (B Subunit)
0002 (CT002) gatC Glu tRNA Gln Amidotransferase (C subunit)
0560 (CT445) gltX Glutamyl-tRNA Synthetase
0946 (CT796) glyQ Glycyl tRNA Synthetase
0663 (CT543) hisS Histidyl tRNA Synthetase
0109 (CT019) ileS Isoleucyl-tRNA Synthetase
0153 (CT209) leuS Leucyl tRNA Synthetase
0931 (CT781) lysS Lysyl tRNA Synthetase
0122 (CT032) metG Methionyl-tRNA Synthetase
0993 (CT836) pheS Phenylalanyl tRNA Synthetase, Alpha
0594 (CT475) pheT Phenylalanyl tRNA Synthetase Beta
0500 (CT393) proS Prolyl tRNA Synthetase
0870 (CT729) serS Seryl tRNA Synthetase_2
0806 (CT581) thrS Threonyl tRNA Synthetase
0802 (CT585) trpS Tryptophanyl tRNA Synthetase
0361 (CT062) tyrS Tyrosyl tRNA Synthetase
0094 (CT302) valS Valyl tRNA Synthetase

Peptide Chain Initiation, Elongation & Termination



1067 (CT353) def Polypeptide Deformylase
0184 (CT122) efp_1 Elongation Factor P_1
0895 (CT752) efp_2 Elongation Factor P_2
0550 (CT437) fusA Elongation Factor G
0073 (CT323) infA Initiation Factor IF-1
0317 (CT096) infB Initiation Factor-2
0990 (CT833) infC Initiation Factor 3
0113 (CT023) prfA Peptide Chain Releasing Factor 1
0576 (CT459) prfB Peptide Chain Release Factor 2
0950 (CT800) pth Peptidyl tRNA Hydrolase
0318 (CT095) rbfA Ribosome Binding Factor A
0699 (CT677) rrf Ribosome Releasing Factor
0697 (CT679) tsf Elongation Factor TS
0074 (CT322) tufA Elongation Factor Tu


Ribosomal Proteins 






0078 (CT318) rl1 L1 Ribosomal Protein
0644 (CT525) rl2 L2 Ribosomal Protein
0647 (CT528) rl3 L3 Ribosomal Protein
0646 (CT527) rl4 L4 Ribosomal Protein
0635 (CT516) rl5 L5 Ribosomal Protein
0633 (CT514) rl6 L6 Ribosomal Protein
0080 (CT316) rl7 L7/L12 Ribosomal Protein
0953 (CT803) rl9 L9 Ribosomal Protein
0079 (CT317) rl10 L10 Ribosomal Protein
0077 (CT319) rl11 L11 Ribosomal Protein
0247 (CT125) rl13 L13 Ribosomal Protein
0637 (CT518) rl14 L14 Ribosomal Protein
0630 (CT511) rl15 L15 Ribosomal Protein
0640 (CT521) rl16 L16 Ribosomal Protein
0625 (CT506) rl17 L17 Ribosomal Protein
0632 (CT513) rl18 L18 Ribosomal Protein
0118 (CT028) rl19 L19 Ribosomal Protein
0992 (CT835) rl20 L20 Ribosomal Protein
0546 (CT420) rl21 L21 Ribosomal Protein
0642 (CT523) rl22 L22 Ribosomal Protein
0645 (CT526) rl23 L23 Ribosomal Protein
0636 (CT517) rl24 L24 Ribosomal Protein
0545 (CT419) rl27 L27 Ribosomal Protein
0327 (CT086) rl28 L28 Ribosomal Protein
0639 (CT520) rl29 L29 Ribosomal Protein
0112 (CT022) rl31 L31 Ribosomal Protein
0961 (CT810) rl32 L32 Ribosomal Protein
0250 (CT150) rl33 L33 Ribosomal Protein
0935 (CT785) rl34 L34 Ribosomal Protein
0991 (CT834) rl35 L35 Ribosomal Protein
0936 (CT786) rl36 L36 Ribosomal Protein
0315 (CT098) rs1 S1 Ribosomal Protein
0696 (CT680) rs2 S2 Ribosomal Protein
0641 (CT522) rs3 S3 Ribosomal Protein
0733 (CT626) rs4 S4 Ribosomal Protein
0631 (CT512) rs5 S5 Ribosomal Protein
0951 (CT801) rs6 S6 Ribosomal Protein
0551 (CT438) rs7 S7 Ribosomal Protein
0634 (CT515) rs8 S8 Ribosomal Protein
0246 (CT126) rs9 S9 Ribosomal Protein
0549 (CT436) rs10 S10 Ribosomal Protein
0627 (CT508) rs11 S11 Ribosomal Protein
0552 (CT439) rs12 S12 Ribosomal Protein
0628 (CT509) rs13 S13 Ribosomal Protein
0937 (CT787) rs14 S14 Ribosomal Protein
1000 (CT843) rs15 S15 Ribosomal Protein
0116 (CT026) rs16 S16 Ribosomal Protein
0638 (CT519) rs17 S17 Ribosomal Protein
0952 (CT802) rs18 S18 Ribosomal Protein
0643 (CT524) rs19 S19 Ribosomal Protein
0754 (CT617) rs20 S20 Ribosomal Protein
0031 (CT342) rs21 S21 Ribosomal Protein



Other Categories 



Chlamydia-Specific Proteins

0561 (CT446) Euo CHLPS Euo Protein
0804 (CT583) Gp6D CHLTR Plasmid Paralog
0186 (CT119) Similarity to IncA_1
0291 (CT232) incB Inclusion Membrane Protein B
0292 (CT233) incC Inclusion Membrane Protein C
1026 (CT377) LtuA Protein
0333 (CT080) LtuB Protein
0005 (CT871) pmp_1 Polymorphic Outer Membrane Protein G Family
0013 (CT871) pmp_2 Polymorphic Outer Membrane Protein G Family
0014 (CT871) pmp_3 Polymorphic Outer Membrane Protein G Family
0015 (CT871) pmp_3 PMP_3 (frame-shift with 0014)
0016 (CT874) pmp_4 Polymorphic Outer Membrane Protein G Family
0017 (CT871) pmp_4 PMP_4 (frame-shift with 0016)
0018 (CT874) pmp_5 Polymorphic Outer Membrane Protein G Family
0019 (CT871) pmp_5 PMP_5 (frame-shift with 0018)
0444 (CT871) pmp_6 Polymorphic Outer Membrane Protein G/I Family
0445 (CT871) pmp_7 Polymorphic Outer Membrane Protein G Family
0446 (CT871) pmp_8 Polymorphic Outer Membrane Protein G Family
0447 (CT871) pmp_9 Polymorphic Outer Membrane Protein G/I Family
0450 (CT871) pmp_10 Polymorphic Outer Membrane Protein G Family
0449 (CT871) pmp_10 PMP_10 (Frame-shift with 0450)
0451 (CT871) pmp_11 Polymorphic Outer Membrane Protein G Family
0452 (CT874) pmp_12 Polymorphic Outer Membrane Protein (truncated) A*
0453 (CT871) pmp_13 Polymorphic Outer Membrane Protein G Family
0454 (CT872) pmp_14 Polymorphic Outer Membrane Protein H Family
0466 (CT869) pmp_15 Polymorphic Outer Membrane Protein E Family
0467 (CT869) pmp_16 Polymorphic Outer Membrane Protein E Family
0468 (CT869) pmp_17 Polymorphic Outer Membrane Protein E Family
0469 (CT869) pmp_17 PMP_17 (Frame-shift with 0468)
0470 (CT869) pmp_17 PMP_17 (Frame-shift with 0469)
0473 (CT870) pmp_18 Polymorphic Outer Membrane Protein E/F Family
0539 (CT412) pmp_19 Polymorphic Membrane Protein A Family
0540 (CT413) pmp_20 Polymorphic Membrane Protein B Family
0963 (CT812) pmp_21 Polymorphic Membrane Protein D Family
0562 CHLPS 43 kDa Protein Homolog_1
0927 CHLPS 43 kDa Protein Homolog_2
0928 CHLPS 43 kDa Protein Homolog_3
0929 CHLPS 43 kDa Protein Homolog_4
0728 (CT622) CHLPN 76kDa Homolog_1 (CT622)
0729 (CT623) CHLPN 76kDa Homolog_2 (CT623)
0133 (CT109) CHLPS Hypothetical Protein
0332 (CT081) CHLTR T2 Protein


Miscellaneous Enzymes/Conserved Proteins

0193 argR Possible Arginine Repressor
1046 Aromatic Amino Acid Hydroxylase
0232 Similarity to 5'-Methylthioadenosine Nucleosidase
0128 (CT035) Biotin Protein Ligase
0513 (CT426) Fe-S Oxidoreductase_1
0911 (CT767) Fe-S Oxidoreductase_2
0373 (CT057) gcpE GcpE Protein
0407 (CT103) HAD Superfamily Hydrolase/Phosphatase
0917 (CT771) Hydrolase/Phosphatase Homolog
0488 (CT385) ycfF HIT Family Hydrolase
0701 (CT675) karG Arginine Kinase
0526 (CT399) kpsF GutQ/KpsF Family Sugar-P Isomerase
0919 (CT773) ldh Leucine Dehydrogenase
0022 (CT349) maf Maf protein
0997 (CT840) mesJ PP-loop superfamily ATPase
0151 (CT148) mhpA Monooxygenase
0730 (CT624) mviN Integral Membrane Protein
0861 (CT720) NifU-Related Protein
0479 (CT380) phnP Metal Dependent Hydrolase
0106 (CT015) phoH ATPase
0329 (CT084) Phospholipase D Superfamily
0435 (CT284) Phospholipase D Superfamily
0581 (CT464) Phosphoglycolate Phosphatase
0897 (CT754) Predicted Phosphohydrolase
0509 (CT422) Predicted Metalloenzyme
1030 (CT375) Predicted D-Amino Acid Dehydrogenase
0531 (CT404) SAM Dependent Methyltransferase
0337 (CT076) smpB Small Protein B
0394 (CT256) tlyC_1 CBS Domain Protein (Hemolysin Homolog)_1
0510 (CT423) tlyC_2 CBS Domains (Hemolysin Homolog)_2
0382 (CT048) yabC SAM-Dependent Methyltransferase
0787 (CT594) yabD PHP Superfamily (Urease/Pyrimidinase) Hydrolase
0611 (CT492) yacE Predicted Phosphatase/Kinase
0579 (CT462) yacM Sugar Nucleotide Phosphorylase
0578 (CT461) yaeI Phosphohydrolase






0345 (CT071) yaeM CT071 Hypothetical Protein
0566 (CT450) yaeS YaeS family Hypothetical Protein
0591 (CT472) yagE YagE family
0039 (CT335) ybaB YbaB family Hypothetical Protein
0101 (CT012) ybbP YbbP family Hypothetical Protein
0915 (CT769) ybeB Iojap Superfamily Ortholog
0137 (CT108) ybgI ACR family
0529 (CT402) ycaH ATPase
0438 (CT287) ycbF PP-loop Superfamily ATPase
0734 (CT627) yceA YceA Hypothetical Protein
0954 (CT804) ychB Predicted Kinase
0261 (CT217) ydaO PP-Loop Superfamily ATPase
0245 (CT127) ydhO Polysaccharide Hydrolase-Invasin Repeat Family
0573 (CT457) yebC YebC Family Hypothetical Protein
0689 (CT687) yfhO_1 NifS-related Aminotransferase_1
0862 (CT721) yfhO_2 NifS-related Aminotransferase_2
0547 (CT434) ygbB YgbB Family Hypothetical Protein
0237 (CT184) yggF YggF Family Hypothetical Protein
0775 (CT606) yggV YggV Family Hypothetical Protein
0396 (CT258) yhfO_3 NifS-related Aminotransferase_3
0605 (CT487) yhhF Predicted Methylase
0575 (CT458) yhhY Amino Group Acetyl Transferase
0592 (CT473) yidD YidD Family
0982 (CT825) yigN YigN Family Hypothetical Protein
0657 (CT537) yjeE YjeE Hypothetical Protein
0768 (CT644) yohI YohI Predicted Oxidoreductase
0336 (CT077) yojL YojL Hypothetical Protein
0217 (CT140) ypdP YpdP Hypothetical Protein
0140 (CT212) yqdE YqdE Hypothetical Protein
0263 (CT221) yqfU YqfU Hypothetical Protein
0139 (CT211) yqgE YqgE Hypothetical Protein
0270 (CT137) ywlC SuA5 Superfamily-related Protein
0879 (CT738) yycJ Metal Dependent Hydrolase








Homologs to CHLTR Hypothetical Proteins


0001 (CT001) CT001 Hypothetical Protein
0020 (CT351) CT351 Hypothetical Protein
0021 (CT350) CT350 Hypothetical Protein
0026 (CT345) CT345 Hypothetical Protein
0035 (CT339) CT339 Hypothetical Protein
0036 (CT338) CT338 Hypothetical Protein
0055 (CT296) CT296 Hypothetical Protein
0062 (CT289) CT289 Hypothetical Protein
0065 (CT288) CT288 Hypothetical Protein
0068 (CT360) CT360 Hypothetical Protein
0071 (CT325) CT325 Hypothetical Protein
0072 (CT324) CT324 Hypothetical Protein
0085 (CT311) CT311 Hypothetical Protein
0087 (CT309) CT309 Hypothetical Protein
0093 (CT303) CT303 Hypothetical Protein
0100 (CT011) CT011 Hypothetical Protein
0104 (CT017) CT017 Hypothetical Protein
0105 (CT016) CT016 Hypothetical Protein
0107 (CT058) CT058 Hypothetical Protein_1
0108 (CT018) CT018 Similarity
0111 (CT021) CT021 Hypothetical Protein
0121 (CT031) CT031 Hypothetical Protein






0129 (CT036) CT036 Similarity
0145 (CT114) CT114 Hypothetical Protein
0150 (CT147) CT147 Hypothetical Protein
0152 (CT149) CT149 Hypothetical Protein
0176 (CT153) CT153 Hypothetical Protein
0188 (CT132) CT132 Hypothetical Protein
0189 (CT131) CT131 Hypothetical Protein
0206 (CT203) CT203 Hypothetical Protein
0229 (CT178) CT178 Hypothetical Protein
0230 (CT179) CT179 Hypothetical Protein
0234 (CT181) CT181 Hypothetical Protein
0249 (CT151) CT151 Hypothetical Protein
0253 (CT144) CT144 Hypothetical Protein_1
0254 (CT143) CT143 Hypothetical Protein_1
0255 (CT142) CT142 Hypothetical Protein_1
0256 (CT144) CT144 Hypothetical Protein_2
0257 (CT143) CT143 Hypothetical Protein_2
0259 (CT142) CT142 Hypothetical Protein_2
0276 (CT191) CT191 Hypothetical Protein
0288 (CT195) CT195 Hypothetical Protein
0293 (CT234) CT234 Hypothetical Protein
0301 (CT242) CT368 Hypothetical Protein
0303 (CT244) CT244 Hypothetical Protein
0308 (CT249) CT249 Similarity
0312 (CT101) CT101 Hypothetical Protein
0328 (CT085) CT085 Hypothetical Protein
0330 (CT083) CT083 Hypothetical Protein
0331 (CT082) CT082 Hypothetical Protein
0334 (CT079) CT079 Similarity
0342 (CT073) CT073 Hypothetical Protein
0343 (CT073) (frame-shift with 0342?)
0350 (CT066) CT066 Hypothetical Protein
0369 (CT058) CT058 Hypothetical Protein_2
0370 (CT058) CT058 Hypothetical Protein_3
0374 (CT056) CT056 Hypothetical Protein
0379 (CT053) CT053 Hypothetical Protein
0381 (CT326) CT326 Similarity
0383 (CT047) CT047 Hypothetical Protein
0387 (CT043) CT043 Hypothetical Protein
0389 (CT041) CT041 Hypothetical Protein
0393 (CT038) CT038 Hypothetical Protein
0395 (CT257) CT257 Hypothetical Protein
0399 (CT253) CT253 Hypothetical Protein
0400 (CT254) CT254 Hypothetical Protein
0401 (CT255) CT255 Hypothetical Protein
0405 (CT105) CT105 Hypothetical Protein
0408 (CT102) CT102 Hypothetical Protein
0409 (CT260) CT260 Hypothetical Protein
0411 (CT262) CT262 Hypothetical Protein
0412 (CT263) CT263 Hypothetical Protein
0415 (CT266) CT266 Hypothetical Protein
0420 (CT271) CT271 Hypothetical Protein
0422 (CT273) CT273 Hypothetical Protein
0423 (CT274) CT274 Hypothetical Protein
0425 (CT276) CT276 Hypothetical Protein
0426 (CT277) CT277 Similarity
0434 (CT283) CT283 Hypothetical Protein






0441 (CT007) CT007 Hypothetical Protein
0442 (CT006) CT006 Hypothetical Protein
0443 (CT005) CT005 Hypothetical Protein
0474 (CT365) CT365 Hypothetical Protein
0476 (CT865) CT865 Hypothetical Protein
0480 (CT383) CT383 Hypothetical Protein
0485 (CT382) CT382.1 Hypothetical Protein
0487 (CT384) CT384 Hypothetical Protein
0489 (CT386) CT386 Hypothetical Protein
0490 (CT387) CT387 Hypothetical Protein
0491 (CT389) CT389 Hypothetical Protein
0496 (CT391) CT391 Hypothetical Protein
0497 (CT388) CT388 Hypothetical Protein
0506 (CT421) CT421 Hypothetical Protein
0507 (CT421) CT421.1 Hypothetical Protein
0508 (CT421) CT421.2 Hypothetical Protein
0512 (CT425) CT425 Hypothetical Protein
0514 (CT427) CT427 Hypothetical Protein
0518 (CT429) CT429 Hypothetical Protein
0522 (CT433) CT433 Hypothetical Protein
0525 (CT398) CT398 Hypothetical Protein
0533 (CT406) CT406 Hypothetical Protein
0537 (CT814) CT814.1 Hypothetical Protein
0538 (CT814) CT814 Hypothetical Protein
0554 (CT440) CT440 Hypothetical Protein
0559 (CT441) CT441.1 Hypothetical Protein
0565 (CT449) CT449 Hypothetical Protein
0572 (CT456) CT456 Hypothetical Protein
0582 (CT465) CT465 Hypothetical Protein
0583 (CT466) CT466 Hypothetical Protein
0588 (CT469) CT469 Hypothetical Protein
0589 (CT470) CT470 Hypothetical Protein
0590 (CT471) CT471 Hypothetical Protein
0593 (CT474) CT474 Hypothetical Protein
0595 (CT476) CT476 Hypothetical Protein
0601 (CT483) CT483 Hypothetical Protein
0602 (CT484) CT484 Hypothetical Protein
0606 (CT488) CT488 Hypothetical Protein
0609 (CT490) CT490 Hypothetical Protein
0622 (CT503) CT503 Hypothetical Protein
0623 (CT504) CT504 Hypothetical Protein
0648 (CT529) CT529 Hypothetical Protein
0658 (CT538) CT538 Hypothetical Protein
0667 (CT546) CT546 Hypothetical Protein
0668 (CT547) CT547 Hypothetical Protein
0669 (CT548) CT548 Hypothetical Protein
0671 (CT550) CT550 Hypothetical Protein
0673 (CT552) CT552 Hypothetical Protein
0675 (CT696) CT696 Hypothetical Protein
0676 (CT695) CT695 Similarity
0681 (CT691) CT691 Hypothetical Protein
0687 (CT482) CT482 Hypothetical Protein
0688 (CT481) CT481 Hypothetical Protein
0700 (CT676) CT676 Hypothetical Protein
0705 (CT671) CT671 Hypothetical Protein
0706 (CT670) CT670 Hypothetical Protein
0708 (CT668) CT668 Hypothetical Protein



0709 (CT667) CT667 Hypothetical Protein
0710 (CT666) CT666 Hypothetical Protein
0711 (CT665) CT665 Hypothetical Protein
0713 (CT663) CT663 Hypothetical Protein
0717 (CT656) CT656 Hypothetical Protein
0718 (CT657) CT657 Hypothetical Protein
0720 (CT659) CT659 Hypothetical Protein
0722 (CT654) CT654 Hypothetical Protein
0725 (CT652) CT652.1 Hypothetical Protein
0726 (CT620) CT620 Hypothetical Protein
0727 (CT619) CT619 Hypothetical Protein
0739 (CT638) CT368 Hypothetical Protein
0742 (CT635) CT635 Hypothetical Protein
0746 (CT632) CT632 Hypothetical Protein
0747 (CT631) CT631 Hypothetical Protein
0751 (CT651) CT651 Hypothetical Protein
0755 (CT616) CT616 Hypothetical Protein
0760 (CT611) CT611 Hypothetical Protein
0761 (CT610) CT610 Hypothetical Protein
0764 (CT648) CT648 Hypothetical Protein
0765 (CT647) CT647 Hypothetical Protein
0766 (CT646) CT646 Hypothetical Protein
0767 (CT645) CT645 Hypothetical Protein
0770 (CT642) CT642 Hypothetical Protein
0774 (CT606) CT606.1 Hypothetical Protein
0776 (CT605) CT605 Hypothetical Protein
0779 (CT602) CT602 Hypothetical Protein
0783 (CT598) CT598 Hypothetical Protein
0791 (CT590) CT590 Hypothetical Protein
0792 (CT589) CT589 Hypothetical Protein
0803 (CT584) CT584 Hypothetical Protein
0807 (CT580) CT580 Hypothetical Protein
0808 (CT579) CT579 Hypothetical Protein
0809 (CT578) CT578 Hypothetical Protein
0810 (CT577) CT577 Hypothetical Protein
0814 (CT573) CT573 Hypothetical Protein
0818 (CT569) CT569 Hypothetical Protein
0819 (CT568) CT568 Hypothetical Protein
0820 (CT567) CT567 Hypothetical Protein
0821 (CT566) CT566 Hypothetical Protein
0822 (CT565) CT565 Hypothetical Protein
0827 (CT560) CT560 Hypothetical Protein
0834 (CT556) CT556 Hypothetical Protein
0840 (CT700) CT700 Hypothetical Protein
0842 (CT702) CT702 Hypothetical Protein
0843 (CT702) CT702 Hypothetical Protein
0852 (CT711) CT711 Hypothetical Protein
0853 (CT712) CT712 Hypothetical Protein
0857 (CT716) CT716 Hypothetical Protein
0859 (CT718) CT718 Hypothetical Protein
0865 (CT724) CT724 Hypothetical Protein
0869 (CT728) CT728 Hypothetical Protein
0874 (CT733) CT733 Hypothetical Protein
0875 (CT734) CT734 Hypothetical Protein
0884 (CT741) CT741 Hypothetical Protein
0887 (CT744) CHLTR Possible Phosphoprotein
0896 (CT753) CT753 Hypothetical Protein



0906 (CT763) CT763 Hypothetical Protein
0908 (CT764) CT764 Hypothetical Protein
0912 (CT768) CT768 Hypothetical Protein
0925 (CT779) CT779 Hypothetical Protein
0938 (CT788) CT788 Hypothetical Protein
0939 (CT790) CT790 Hypothetical Protein
0943 (CT794) CT794.1 Hypothetical Protein
0945 (CT795) CT795 Hypothetical Protein
0956 (CT805) CT805 Hypothetical Protein
0960 (CT809) CT809 Hypothetical Protein
0989 (CT832) CT832 Hypothetical Protein
0994 (CT837) CT837 Hypothetical Protein
0995 (CT838) CT838 Hypothetical Protein
0996 (CT839) CT839 Hypothetical Protein
1002 (CT845) CT845 Hypothetical Protein
1003 (CT846) CT846 Hypothetical Protein
1004 (CT847) CT847 Hypothetical Protein
1005 (CT848) CT848 Hypothetical Protein
1006 (CT849) CT849 Hypothetical Protein
1007 (CT849) CT849.1 Hypothetical Protein
1008 (CT850) CT850 Hypothetical Protein
1010 (CT852) CT852 Hypothetical Protein
1011 (CT853) CT853 Hypothetical Protein
1015 (CT857) CT857 Hypothetical Protein
1016 (CT858) CT858 Hypothetical Protein
1019 (CT860) CT860 Hypothetical Protein
1020 (CT861) CT861 Hypothetical Protein
1022 (CT863) CT863 Hypothetical Protein
1032 (CT373) CT373 Hypothetical Protein
1033 (CT372) CT372 Hypothetical Protein
1034 (CT371) CT371 Hypothetical Protein
1057 (CT356) CT356 Hypothetical Protein
1058 (CT355) CT355 Hypothetical Protein
1061 (CT330) CT330 Hypothetical Protein
1073 (CT371) CT371 Hypothetical Protein



Coding Genes Not in C. trachomatis 

0486 Hypothetical Proline Permease
0279 Possible ABC Transporter Permease Protein
0505 3-Methyladenine DNA Glycosylase
0193 argR Similarity to Arginine Repressor
1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
1044 bioB Biotin Synthase
1042 bioD Dethiobiotin synthetase
0585 Similarity to Cps IncA_2
0562 CHLPS 43 kDa Protein Homolog_1
0927 CHLPS 43 kDa Protein Homolog_2
0928 CHLPS 43 kDa Protein Homolog_3
0929 CHLPS 43 kDa Protein Homolog_4
1045 Conserved Hypothetical Membrane Protein
0251 Conserved Hypothetical Protein
0278 Conserved Outer Membrane Lipoprotein Protein
0907 CutA-like Periplasmic Divalent Cation Tolerance Protein
0171 guaA GMP Synthase
0172 guaB Inosine 5'-Monophosphate Dehydrogenase
0608 Uridine 5'-Monophosphate Synthase
0735 Uridine Kinase
0980 Similar to Saccharomyces cerevisiae 52.9 kDa Protein
0232 Similarity to 5'-Methylthioadenosine Nucleosidase
1046 Tryptophan Hydroxylase
0477 yqeV_Bs Conserved Hypothetical Protein
0048 yqfF_Bs Conserved Hypothetical IM Protein
0587 yvyD_Bs Conserved Hypothetical Protein
0143 yxjG_Bs_1 Conserved Hypothetical Protein
0448 yxjG_Bs_2 Conserved Hypothetical Protein





0006 0180 0440 0977
0007 0181 0455 0978
0008 0190 0456 1018
0009 0203 0457 1023
0010 0204 0458 1027
0011 0205 0459 1029
0012 0209 0460 1040
0028 0210 0461 1051
0029 0211 0462 1052
0034 0212 0463 1053
0041 0213 0464 1054
0042 0214 0465 1055
0043 0215 0472 1056
0044 0216 0473 1064
0045 0218 0481 1065
0046 0220 0483 1066
0047 0221 0492 1070
0049 0222 0493 1071
0050 0223 0494 1072
0051 0224 0498
0063 0225 0499
0064 0226 0516
0066 0233 0517
0067 0240 0523
0069 0241 0524
0070 0242 0553
0099 0243 0574
0124 0266 0600
0125 0267 0656
0126 0268 0664
0130 0277 0677
0131 0283 0678
0132 0284 0685
0142 0285 0686
0146 0287 0724
0147 0352 0731
0155 0353 0745
0156 0354 0753
0157 0355 0794
0158 0356 0795
0159 0357 0796
0162 0358 0797
0163 0365 0798
0164 0366 0799
0165 0367 0829
0166 0368 0830
0167 0371 0831
0168 0372 0881
0169 0375 0882
0170 0376 0913
0173 0391 0914
0174 0398 0930
0175 0404 0944
0177 0431 0964
0178 0432 0975
0179 0439 0976



Table 3 



Chlamydia pneumoniae Genome Encoded Protein* 

CPn_0001 330 4

CT001 hypothetical protein 

KRLKDEIKYTSLRRKAMLGK 1 1 RGLSSLIVI LCALNVGL IG ITHNKLNI I AKLCCGVSTP 
ATQ ITYI I IGIACVrCLLSFCPFCSKKSRHSHCDSCSSGGCHSHHSDKN 

CPn_0002 570 875

gatC-Glu tRNA Gln Amidotransferase (C Subunit)

! iM" v j F ! U XI' r.F. L ! . i . i ./■ r "\. < A . . f^T *AV I'V *M]' r A. " T \C"/T~rXF 
MHVVNP/EDLREDSVTSDFNREEFLRNVPESLGGLVKVPAVIK 

CPn_0003 889 2370 

gatA-Glu tRNA Gln Amidotransferase

K I MYRYSALELAKAVTLGELTATGVTQHFFHR IEEAEGOVGAFISLCKEQALEQAELIDK 
KRSRGEPLGKLAGVPVGIKDNIHVTGLKTTCASRVLENYQPP 

KLNMDEFAMGSTTLYSAFHPTHNPWDLSRVPGGSSGGSAAAVSARFCPVALGSOTGGSIR 
Q P AAFC GWG FK P SYG AVSRYG LVAF AS SLDQ I G P LANTVEDVALMMDVFSG RD PKDAT S 
REFFRDSFMSKLSTEVPKVIGVPRTFLEGIJIDDIRENFFSSLAIFEGEGTHLVDVELDIL 
SHAVSIYYILASAEAATNLARFDGVRYGYRSPQAHTISQLYDLSRGEGFGKEVMRRILLX3 
^^fVLS AERQNVYYKKATAVRAK IVKAFRT AF EKC E I LAM PVCS S PAFEI GE ILDPVTLYL 
QDIYTVAMNLAYLPAI AVPSGFSKEGLPLGLQ I IGQQGQDQQVCQVGYSFQEHAOIKQLF 
SKRYAKSWLGGQS 

CPn_0004 2334 3833 

gatB- (Pet112) Glu tRNA Gln Amidotransferase (B Subunit)

EICQKCCSRRSIMSAVYADWESVIGLEVHVELNTASKLFSSALNFU^GD^ 

LPGSLPVIiNQSAVEKAVLFGCAVEGEISLLSRFDRKSYFYPDSPRNFQITQFEHPIIRGG 

R I KA I VQGE ER YF ELAQTH I ED DAGML KH FG E F AGVDYNRAGVP L I EIVSKPCMFCPEDA 

VAYAT SLVSLLDY IG ISDCNMEEGS I RFDVNVSVRP KGS P ELRNKVE I KNMNSFAFMAQA 

L EAEKQRQ I DEY LNQ PNKD PKL V I PAATYRWD PEKK KT\/LMRLK£S AEDYXY F P EPDLPT 

LOLTESYIERIRKTLPELPYDKYHRYIQEYGLSEDIASILISDKNIATFFEVACKDCKNF 

RSLSNWVTVEFGGRCKTLGVKLPSSGIFPEGVAOLVNAIDCCVITGKIAKEIADI^IMESP 

GKNPEEILKEKPEIXPMSDEGELQKIIAEVVLANPESIVDYKMjKTK^ 

AG KAP PKRVNE T ,T ,T ,T .ELDKG 

CPn_0005 4097 6892

pmp_1- Polymorphic Outer Membrane Protein

SD IHFDLGTKMR FSLCGFPLVF SFTIXSVFDTSLS ATT ISLT PEDSFHGDSQNAERSYNV 
OAGDVY S LTG DV S I SNVDNSALNKAC FNVTSG SVT F AGNHHGL YFNN I S SGTT KEGAVLC 
CQJ3F£ATARFSGFSTLSFIQSPGDIKEC<KLYSKNAI^ 

GA^t^IVG^^y^}SVSFYQNAATFGGAIHSSGPLQIAVNQA£IRFAQNTAKNGSGGALYSDG 
DI,P$PQNAYVLFRENEALTTAr GKGGAVCCLPTSGS ST P VP I VT FS DNKQLVF ERNHS I M 
GG^YARKI^ISSGGPTLFINNISYANSQNLGGAIAID^ 
SLPimiGIHL'I^NAKFLKLQAR^ 

SG EKSLAND P RD FK S T I PQNVNLS AG YLV I KEGAEVTVS KFTQ S PG SHLVLD LGTKL I AS 
KEplAlTGLAIDIDSLSSSSTAAVIKAOTAtnCQISVTDSIELISPTGNAYEDLRMRN 
F P ££S L E PG AGG SVTVTAG DF L PVS PHYG FQGNWKIAOTGTGNKVG £F FWDK INYK PRP E 
■ KE&SfLVPN I LWGNAVDVRS LMQVQETHAS S I^QTDRGLWI DGIGNFFHVSASEDNI RYRHN 
SGGYVLSVNNE I T PKHYT SMAF SCLF SRDKDY A VSNNEYRMY LG SYL YQ YTT SLGN I FRY 
ASfU#NVNVG I LSRRF LQN P LM I FH F LCAYGH ATNDMKT DYANF PMVKNSWRNNCWAIEC 
GGSMPLLVFENGRLFQGAI PFMKLQLVYAYQGDFKETTADGRRFSNGSLTSI SVPLGIRF 
EKlAlLSQDVLYDFS FSYI PDI FRKDPSCEAALVI SGDSWLVPAAHVSRHAFVGSGTGRYH 
FN^YfELLCRGSIECRPHARNYNINCGSKFRF 

CPn_0006 7299 7141 

No robust homolog present in GenBank/EMBL as of 11/7/98
KQ|QiEPLRS ALLERLS EWLVLLGVPS PETTRST PEKDANQLPKDSRNRTLESL 

CPn_0007 7498 10496

No robust homolog present in GenBank/EMBL as of 11/7/98
KSiRY^LSLIFSFLWrPLTDSTTSSLSTSLLDEGNPQSMRKLRILAIVLIALSIILIAG 
GWLLTVAI PGLSSVI SSPAGMGACALGCVMLALG I DVLLKKREVP IVLASVTTTPGTGS 
PR&S-SS ISGADSTIRSLPTYLLDEGHPQSMRKLRILAIVLIVFS I ILIASGWLLTVAI P 
GLSSVI SS PAGMGACALGCVMLALG I DVLLKKREVP IVLASVTTTPGTGSPRSGIS I SGA 
DST^HSLPTYPLDEGH PQSMRKLRI LAI VLIVFS 1 1 L IASGWLLTVAI PGLSS 1 1 SSPA 
EMGA£;ALGCVMLALG I DVLLKKREVP I WPAP I PEEWI DDI DEES IRLQQEAEAALARL 
PEE&SAFEGYIKVVESHLENMKSLPYDGHGLEEKTKHQ I RWRSSLKAMVPEFLDI RRIF 
EEEEFFFLSARKRLIDLATTLVERKILTEQLERNNLRKAFSYLYQDS IFKKI IDNFEKLA 
V^FMILSKSICRFTIIFF^HEHGVAKSLLHKNAVIXEKVIYRSLOKSYRDIGMSSAKMKI 
LHGNPFFSLEDNKKTIMKEHAEMLESLSSYRKWI^LSDENVVDTPSDPKKWDLSGIPCR 
DALSEISRDEQWQKKAHLKHQESLYTQARDRLTDQSSKENQKELEKAEQEYISSWERVKK 
FE I ERVQER I RA I QKLYPN I LE REE ETTGQ ETVTPTVQGTT AS SDLTD I LG R I EVS SRED 
■ NQNQESCVKVLR SH EVEMS WEVKQE YG PK KK E FQDQMGS LER F FT EH I E ELEVLQKDYSK 
HLSYFKKVNNKKEVQYAKFRLKVLESDLEG ILAQTESAESLLTQEELPI LATRGALEKAV 
FKGSLCCALASKAKPYFEEDPRFQDSDTQLRALTLRLQEAKASLEEEIKRFSNLENDIAE 
ERRLLKESKQTFERAGLGVLREIAVESTYDLRSLTNTWEGTPESEKVYFSMYU^YYNEEK 
RRAKT R LVEMTQ RYRDFKMAL EAMQ FN EEAL LQ EELS IQA PS E 

CPn_0008 10780 11685 

No robust homolog present in GenBank/EMBL as of 11/7/98

CKYSYLLNYPPPPRRSLGVSCSKLRSLSITLLVLGVLLLTLGIPGLT^GISFGAGLGFSA 

LGGVLVISGLLFLLVRREVPTVRSEEIPRGVSVTPSEEPALEKAQKEPETKKILDRLPKE 

LDQLDTYIQEVFACLERLKDPKYEDRGLLTEAKEKLRVFDWEKDMMSEFLDIQRVLNEE 

AYWEHCQDPLENIAYEIFSSQELRDYYCAGVCGYLPSGDARADRLKRSVKEVMDRFMRV 

TWKSWEASVMLDHSYGVARELFKKAVGVLEESVYKILFKSYRDAFYECEKAKIQRDGRFK 

WL 

CPn_0009 1168** 13110

No robust homolog present in GenBank/EMBL as of 11/7/98
r/rSAHAEORKRD ENOCWEDLKOT IFWVCEI IDCTDE CTVRKiXTMWLDRYADKF ILREKEEK 
MERHELF11ATMVRKA!]r,l!AYAKAKAAFEKERSNEN0PKVKDVEKWLSKGLAEFRNQESRR 
ARERLRELOTLYPEVGVEERVLERORTKKVNLENLYADIEKKYHIICVRCQEHYWKEVENK 
EAEY R E N( ; E K V r >: : AE E V : ; HC LQ R L CDC L LTW.'J k k l r KAE es V f em K F DAT ek lgnkvls d 
VrNRLErUTEDAEEMrrRIEEtE^LRMVELPLL.FMKNTFEKASLOYNSCKEMLAKVEPQ 
r^ESPTYR:*:;c i FHr,ERL^0r)L J iJTAVTNr0ER[,CK';r f JOLEGKVRTCr! Dl ILREOMKHFET/QG 
I.NFrNFKLI,WV(;Ar-:i,FTOARLDLVATVPYMCFYLOY!(N[KREKVR:;OWMAKTERYRErRQ 
Ai-Vj( IVMKEDLLA KDT I LKFKD\ WlXRDDWLLRDERKNRORRf . I TNK [AAAOOPVKGF 

CPn_0010

No robust homolog present in GenBank/EMBL as of 11/7/98



CKYFYLR3YPPP PQHSVOG I "SFSKLRVLA ITFLV FOMLLL I .7CALFLTLC IPGLSAAIS 
FGLG IGLSALGGVLMISGLLCLLVKREI PTVRPEE r PEGVSLAPSEEPALQAAQKTLAQL 
PKELDQLDTDIQEVFACLRKLKDSKYESRSFLNDAKKELRVFDFV/EDTLSE I FELRQ IV 
AQEGWDLNFLINGGRSLMMTAESESLDLFHVSKRLGYLPSGDVRCEGLKKSAKEIVARUM 
SLHCEI HKVAVAFDRNSYAMAEKAFAJCALGALEESVYRSLTQSYRDKFLESERAKIPWNG 
H 1TWLRDDAKSGCAEKKLGMPRNVGRNLGKQSFG 

CPn_0010.1 14268 15746

* : . t m* ■ i! ' ' * w i* " 

^" pap 1 ! 3 W".'" 7r :-'.\v ! "•<""''■:. : r ? ' ' -iwvr'':,^!"^ .: "Kr™ 

FLKRl^RKCALAKTTFEKKR^KKWLQAVEEANARRiKWRDWY'DQ 

YPEVSVS IRENKIQETRSNLEKAYEAIEENYRCCVREQEDYWKEEEKREAEFRERGNKIL 
SPEELESSLEQFDHGLKNFSEKLMELEGH ILKLQKEATAEVENKI LSDAESRLEI VFEDV 
KEMPCR I EE I EKTLRMAEL PLL PT KKAFEKACSQYNSCAEML EKVK PYCKESLAYVTSKE 
RLVSI^EDLPJ^YTECQKRFG^DSGLESEVRACREQLRERIQEFETQGLDLVEKFrjrVS 
S RLRNT ECDCVSGVKK EA P PG KK FY AC YYD E I YRVR VQS R WMTMS E RLR EGVQ ACNKML K 
AGLSEEDKVLKEEEYWLYRSERKNXEKRIACTKIVATC^ 
DKARFL FNREDH S 

CPn_0011 15377 16614 

gatB- (Pet112) Glu tRNA Gln Amidotransferase (B Subunit)
FWYSIMTAAPAILHVSPTPPEETKFVI PKDSKSRALG ITLLWG I LLWCGAIVLSGVI S 
GLSALI VCGLGISTI SLGWLFVLGL I LLLRKRELTLEQ I EAKQ I AETFADELKELEMY I 
QSTEKSLEKIEGSRYSDQGFLNRATQKILDLESSLSSITSEFRDLRQLFDEEKIELLSGE 
RLLEFIAANLFKC^RDVYL^^^NIADIRAY^^GP^INYKVAMVI EKAKAWHEF IVLTTMAR 
ELEFFF 

CPn_0012 16596 18212 

gatB- (Pet112) Glu tRNA Gln Amidotransferase (B Subunit)
GIRVFFLKNKYGLI^GMYQENLRIiLERLLYNSVQKSYA^ 

DKEKCAEAEKAFLEQQKILLDYGKS I FWIJ^ENDEINLNDPWSWGLNTVRTRKVFQEVDDS 
ERWNHKVLIQKLEDDYEKLLEESSKESTEANKKLLSDLVDRLEDAKTKFFLKKQEEVETR 
VXDIJIARYGGTVDPKQDTEAKKKVELEAS LET FLDS I ESELVQCLEDQD IYWKEQDVKDL 
ARTQELEEQDIEAKREEAAEDLRSIjNERLKKSKTMLDRAKWH IENAEDS ITWWTSQIEMK 
DMKARLKILKEDITSVLPEIDEI ETCLSLEELPLLTTRELLTKSYLKFK ICSETLLKMTS 
VFF2>JNIYVQEYEVQI^NLGFKLCOISQRFGKKQDDFANLEEQVALCKKRL^ 
GFNFMKFJ!)FKAAAKDLYIRSTAEQKMNTDVPCMZLFRRYHEEVNKPLIiEI^^ 
AKKiCI^SIJlU^EKELI^KEIKKEEFYQKKC^RHADRSRHTT^ 

CPn_0013 18509 21106 

pmp_2- Polymorphic Outer Membrane Protein 

LRDRI^FIYI^YWKESPLR£KiCVVMKIPLRFLLISLVPTLSMSNLU3AATTEELSASNS 
FDGTT STTSFSSKTS S ATDGTNYVFKDSWI ENVP KTG ETC; stsc F KNDAAAG DLNF lgg 

gfsftfsnidattasgaaigseaanktvtlsgfsalsflkspastvtngl^ 
lld>tokvliqdnfstgix^aincagslkiannkslsfignssstrggaihtknltlssgg 
etlfqgntaptaagkggaiaiadsgtlsisgdsgdi ifegnt igatgtvshsaidlgtsa 
k italraaqght iyfydp itvtgsts vadaln ins p dtgdnkeytgt I vfsg eklt eaea 

KDEK^TSKLLQNVAFK>K3TVVLKGDVVL sangfsqdanskl IMDLGTSLVANTES I ELT 

nleini dslrngkkiklsaataqkdi ridrpwlai sdesfyqngflnedhsydgileld 

AGKD IV I S ADSRS IDAVQS PYGYQGKWT INWSTDDKKATVSWAKOS FNPTAEQEAPLVPN 
LLWGS F I DVRS F QNF I ELGTEG APY EKRFWVAG I SNVLH RSG RENQ RK F RHVSGG AWG A 
STRMPGGDTLSLGFAQLFARDKDYFMNTNFAKTYAGSLRLQH DAS LYSWS I LLGEGGLR 
EILLPYVSKTLPCSFYGQLSYGHTDHRMKTESLPPPPPTLSTDHTSWGGYVWAGELGTRV 
AVENTSGRGFFQEYTPFVKVQAVYARQDSFVELGAI SRDFSDSHLYNLAI PLG IKLEKRF 
AEQYYHWAMYSPDV^RSNPKCTTTLLSNCGSWKTKGSNIARQAGIVQASGFRSLGAAAE 
LFGNFGFEWRGSSRSYNVDAGSKIKF 

CPn_0014 21365 21922

pmp_3- Polymorphic Outer Membrane Protein 

IQNQS I YFTMKS S F PKFVFST FA I F PLSM I AT ETVLD S S AS FDGNKNGNFSVRESQEDAG 
TTYLFKGNVTLENIPGTGTAITKSCFNNTKGDLTFTGNGNSLLFCTVDAGTVAGAAVNSS 
VVDKSTTFIGFSSLSFIASPGSSITTGKGAVSCSTGSLSLTKMSVCSSAKTFQRIMAVLS 
PQKLFH 

CPn_0015 21835 24174 

pmp_3-PMP_3 (frame-shift with 0014) 

LEFDKNVSLLFSKNFSTDNGGAITAKTLSLTGTTMSALFSENTSSKKGGA IQTSDALT IT 
GNQGEVSFSDNTSSDSGAAI FTEASVT I SNNAKVSF I DNKVTGASSSTTGDMSGGAICAY 
KT S TDT KVT LTG NQMLLFSNNT STT AGGA I YVKKLE LASGG LT L F S RNS VNGGT AP KGG A 
IAIEDSGELSLSADSGDIVFLGNTVTSTTPGTNRSSIDIjGTSAKMTALRSAAGRAIYFYD 
PITTGSSTTVTDVLKVNETPADSALOYTGNI I FTGEKLSETEAADSKNLTSKLLQPVTLS 
GGTLSLKHGVTLO/rQAFTQQADSRLEMDVGTTLEPADTST INNLVINI S S I DGAKKAKI E 
TKATSKhJLTLSGTITLIJ5PTGTFYFJJHSLRWPQSYDILELKASGT\^STAVTPDPIMGEK 
FHYGYQGTWGP IVWGTGASTTATFNWTKTGYI PNPERIGSLVPNSLWNAF IDI SSLHYLM 
ETANEGLQGDRAFWCAGLSNFFHKDSTKTRRGFRHLSGGYVIGGNLHTCSDKILSAAFCQ 
LFGRDRDYFVAKNOGTVYGGTLYYQHNETYI SLPCKLRPCSLSYVPTE I PVLFSGNLSYT 
HTDNDLKTKYTTYPTVKGSWGNDSFALEFGGRAPICLDESALFEOYMPFMKLQFVYAHQE 
GFKEQGTEAREFGSSRLVNLALPIGIRFDKESDCODATYNLTLGYTVDLVRSNPDCTTTL 
RISGDSWKTFGTNLARQALVLRAGNHFCFNSNFEAFSQFSFELRGSSRNYNVDLGAKYQF 

CPn_0016 24353 26188

pmp_4- Polymorphic Outer Membrane Protein 

RSDFALKRGCHMRSSFSLLLIS.T3LAFPLLMSVSADAADLTLGSRDSYNGDTSTTEFTPK 
AATSDASGTTY I LDGDVS I SCAGKQTSLTTSCFSNTAGNLTFLGNGFSLHFDN T I SSTVA 
GVWSNTAASG ITKFSGFSTLRMLAAPPTTOKGAIKITDOLVFEt! IGNLDLNENASSENG 
GAIOTKTLSLTGSTRFVAFI^NGSSOCGGAIYASGDSVISENAGrLSFGNNSATTSGGAI 
3 A EGNLV I S NNQN I FF DGC KATTr IOG A I DC N K AG AN P D P I LTLSG IM ES L H F LNNTAGNSG 
OA I YT K K LV LSSG RCG VL F N N KAAN AT PKGG A I A I LDSGE I GTSADLCJN 1 1 FEGNTTST 
TGO PA.^V1*RNA I DLASNAK FLNLRATRG* IK7 1 FYDP ITS:>GATDKLSLNKADAGSGNTY£ 
GY rVF^CJEKLSEEELKKPDNLKJTPrOAVELA/vGALVLKDf JVTW \NT rTOVEGSKWMD 
( ^TTrEiXGAKTVTLNf It A I N I DCLDGTt JKA \ I KATAA: IK DVAL.lt' r I MI .VPAOGNYYEHH 
NL'XH.^VE- PL E t;i>f>A0f JTMTTTL I I 'DTI' I f .NTTNIIYGY^X JNWNNt'I iiRI'< 'Nt'KNKKCYLN 
LD 

' f'M_(JE) t / ^C^f ^ ■/]/() 

pmp_4-PMP_4 (frame-shift with 0016)

llr ETMGiK<;n;i rvwvnnvrxKTKNATi.'rv/rKTYiYKi'NPnpfji ;n,vpM:;i-W[;:;rvnvRS 
ro , :f.Mijps'r:.:;[u;:::;Miif.wv:.wiAijfa.HKr>jKi;NOf< , ;Yi<H:: r ;Ac;YAUX;(;F}''t'A:;ENFFN 
faiv vL.^;Yi)Kn(iLVAKNirnivvA^AM.;/i'Mi.t;F':K , t , E,AKir.: :< iN.uj'ir.f 'Pvi-'Marfayg 

HThNNMTTKYT(;Y'lfVKfT;w:NL/.|v;if < V IA T I'VVA: ;( IKR "IWVI^rHTI'FLNtiKM T YAHQ 
NDFK VM ;TKC \\* : ' FOM-M .FNUY/f'' A, 1 Y"\ \Y.V TjK'ITYDI .. ! LAYVPIjV f RNfit^JC'T'lTLM 
Vr:GDCW.Tr<:GT.'>LSRQALLVRACasiHHAFASNrE'/F^QFEVELBGS3R3YAIDLGGRPGF

CPn_0018 275 Lj 2 900 3

pmp_5- Polymorphic Outer Membrane Protein 

ETfWKTSVSMLLALIXSGASSIVLHAATTPL^PEDGFra 

LTGEVLY I D PCKGCS ITGTCFVETAGDLTFIXINGNTLKFLSVDAGAN I AVAHVQGSKNLS 
FTDFLS LVITES PKSAVTTGKGSLVS LGAVQLQD INTLVLTSNASVEDGGVIKGNSCL IQ 
G T KNSA I FOQNTSSKKOGA I .TTTQGLTT E^NLGTLKFNENKAVTSGGALDLGAASTFTAN 

1 , : "> 'NKT.'/ ui ** * ' i> * ;a if J' .«.:.. ■■TE.r.T . ^:-v« n r^w.r'Vh'^ ;TnT[.;iT^:-3 
■: irr/i< ;MT'X7>r'- ,;\L.;,vvi.i :: v/^^.r ,: ,hw*"nATPLf> -a rrrrrr'^vr.or.FTC 
^XjDIVFEGNQVTTTAPNATTKRNVIHLEGTAKWTGLiAASGGNAIYFYDPITT 
LR I NEVS ANQKLSGS I VFSGER LSTA £A I AENLTS R I NQ PVT LVEGSLVLKQGVTL ITQG 
FSQEPESTLLLDLGTSL 

CPn_0019 29007 30356 

pmp_5-PMP_5 (frame-shift with 0018)

astedivitnlsinadtiygknpinivasaanknitltgtlalvnadgafyen^ 

dysfvklsfgaggtiitqdasqkplevapsrphygyoghwwovipgtgtqpsqanlewv 

rtgylpnperc^slvpnslwgsfvdqraiqeimvnssqilcqergvwgagianflhrdki 

nehgyrhsgvgylvgvgthafsdatinaafcqlfsrdkdywsknhgtsysgvvfledtl 

efrspcgfttdssseaccnqwtidmqlsyshrnirom 

g attyyypnst flfdyys p flrlgctyahqedfketggevrhftsgdlfnlavp tgvkf e 
rfsdckrgsyeltlayvpdvi rkdpkstatlasgatwsthgnnlsrqglqlrlgnhcl in 
pgiewshgaielrgssrnyninlggkyrf 

CPn_0020 32717 30603 

Predicted OMP [leader (14) peptide: outer membrane]

KLWSN PNLRLMKRC F L F LAS FVLMGS SAD ALT HQ EA VKKKNSYLS HF KSVSG IVT I EDGV 

Lf ^ I HNNLR I Q ANKVYV ENTVGQ S LKL VAHGNVMVNY RAKTLVC DYLEYYEDT DSC LLTNG 

RFAMYPWFLGGSMITLTPETIVIRKGYISTSEGPKKDLCLSGDYLEYSSDSLLSIGKTTL 

RVCRI P ILFLPPFSXMPMEI PKPPINFRGGTGGFLGSYLGMSYSPrSRKHFSSTFFLDSF 

FKHGVGMGFNLHC SQKQVP ENVFNMKSYYAHRLAI DMA£LAHbRYRI*HGDFCFTHKHVNFS 

GEYHLS DSWETVAD IF PNN FMLKNTG PTRVDCTWNDNYFEGYLTSSVKVNSFQNANQELP 

YLTLRQYPISIYNTGVYLENIVECGYI^AFSDHIVGENFSSIJIAARP 

TLSSTLGSSLIYYSDVPEISSRHSQLSAKI^LDYEFLLHKSYIQRRHIIEPFVTFITETR 

PLAXNEDH^YIFSIQDAFHSLNLtJ^AGIDTSVLSKTNPRFPRIHAKLWTTHIL 

FPKTACELSLPFGKKNTVSLDAEWIWKIfflCWDH^ 

I KC DRENF I LDVS RP IDQLLDS PLS DHRNL ILGKLFVRPHPCVINYRLSLRYGWHRQDTPN 
YLEYQMILGTKIFEHWQLYGVYERREADSRFFFFLKLDKPKKPPF 

CPn_0021 34470 32707

Predicted OMP [leader (19) peptide] 

CSaiPYPNI EILARGVEHRSMGLFKLTLFGLLLCSLPI SLVAKFPESVGHKILYISTQST 
QQAXaTYLEALDAYGDHDFFVLRK I G EDYLKQ S I HSSDPQTRKSTI IGAGLAGSSEALDV 
LSOAMETAD PLQQLLVLS A VSG HLGKTSDDLL FKALAS P Y P V I RLEAA YRI^WLKNTIO/I 
DF&tfSF IHKLPEEIQCLSAAIFLRLETEESDAYI RDLLAAKKSAIRSATALQ IGEYQQKR 
FL^P^RM^TSASPQDQEAILYALGKLJtDGQSYYNIKKQL^ 

EEJ^PVIKKQALEERPRALYAI^LPSEIGIPIALPIFLKTKNSEAIOJ^ALAIiEl^ 
DTPKLL EY ITERLVQPHYNETLALSFSKG RTLQNWKRVN 1 1 VPQDPQERERLLSTTRGLE 
ECjILTF LFRL PKEAYL PC I YKLLASQ KTQ LATTAI S FLSHTS HQ EALDLLFQAAKL PGE P 
II^YADIAIYNLTKDPEKKRSLHDYAKKLIQETLLFVDTENQRPHPSMPYLRYQVTPES 
Rlf^|^DILETlATSKSSEDIRLLIQIJfr£GDAi(NFPVI^GLLIKIVE 

CPn_0022 35042 34395

maf" " 

TI3LQVI SNCCNVSNTRSFYSMSLPLVLGSSSPRRKF ILEKFRVPFTVIPSNFDESKVSYS 
GDP^AYTQELAAQKAYAVSELHSPCDCI ILTGDT I VSYDGRIFTKPQDKADAIQMLKTLR 
NQITHDVVTS I AVLHKGKLLTG S ETSQ I S LTM I PDHRI ESYIDTVGTLNNCGAYDVCHGGL 
ILK^GCVYNVCGLPIQTLKYLLEELNIDLWDYSI 

CPn_0023 36657 35014

yjjK/alr-ABC Transporter Protein ATPase

EMR^LLYSKQHFVMLSAMSIVLDKIGKSLGTRILFDDVSWFNPGNCYGLTGPNGAGKS 
TLLKI IMGMI EPTRGS ISLPKKVGI LRQN IDS FHDTTVLDCVIMGNTRLWEALQRRDNLY 
LQE^DAIGMELGEIEEIIGEE^YRADSEAEEU^TGIGIPNEMFDKKMAMIPIDLQFRV 
LLCOAL FGH PEALLLDE PTNHLDLY S I NWLGNFLKDY EGTVI WSH DRH FLNT ITTH IAD 
I DYj>J 1 1 1 Y PGNYDDMVEMKTAS REQEKAD I KSKEKKI SQLKEFVAKFG AGS RASQVQS R 
LRElkKLQPQEL KK SN I QR PY I RFPLSDKS SG KWL SLEAI T KDYG DHQVI H PFSLE IYQ 
GDKLGI IGNNGLGKTTLMKLLAGVEAPSSGS IKLGHQAICSYFPQNHSDVLADCGQETLF 
EWLRNRKTG INDQE IRS VLGKMLFGGDDAFKQ IQALSGGETARLLMAGMMLENHNVLILD 
EANNHLDLESVSALSWAINDYKGTAI FVSHDRGLIQDCATKLLI FDKDK ITFFDGTMVDY 
TAGHKQLL 

CPn_0024 37605 36661 

xerC-Integrase/recombinase

REVMIASIYSFLDYLKMWSASPHTLRNYCLDLNGLKIFLEERGNLAPSSPLQLATEKRK 
VS ELP F S L FTKEHVRMY I AKL I ENG KAKRT I K RC LS S I K S FAHYCV I QK I LL EN P AET I H 
GPRLPKELPSPMrfAQVEVLMATPDISKYHGLRDRCLMELFYSSGLRISEIVAVNKQDFD 
LSTHLIRIRGKGKKERI IPVTSNAIQWIQIYLNHPDRKRLEKDPQAIFLNRFGRRISTRS 
IDRSFQEYLRRSGLSGHITPHTIRHTIATHWLESGMDLKTIQALLGHSSLETTTVYTQVS 
VKLKKQTHQEAHPHA 

CPn_0025 38610 37b84

elaC/atsA-Sulphohydrolase/Glycosulfatase

I LMSS REL I I LGCSSQQ PTRTRNQGAYLFRWNGEGLLFD PGEGTQRQF I FAN I APTTVNR 
rFVSHFHGDHCLGLGSMLMRLNLDKVSHPIHCYYPASGKKYFDRLRYGTIYHETIQWEH 
PISEECIVEDFGSFRIEAQRLQHQVDTLGWRITEPDTIKFLPKELESRGIRGLriQDLIR 
DQEISKjGSTVYLSDVSYVRKGDSIAIIADTLPCQAAIDLAKNGCMMLCESTYLEQHRHL 
AEJflFHMTAKOAATLAKRAATQKLILTHFSARYLNLDDFYKEASAVFPNVCVAOEYRSYP 
FPKNPLLNK 

<:Pn_OtJ*J*i V»h37 W!f>2 

::NFAM:;fII,[ P::LHtJ.';VTSYFHKPQP EKOA-\PSKJ Ii<L [f'N [ AYL [ [ [OVL.VYVYLVGAML 
< ■MFUr':;v;r E l L/;t,:;';r,ALLVLL:: I I'NPCL [WW r "iTKKTKF. [APKDA.;E:JUPTK.';A:;RKGn 

!'gl,:;[1!ltIJtlKl'K^lFrRT0[.F.KGVNYVTNKFKS^;^x:;^'lu^Dl■:!^^*:pRQ:;KP': r ;E[Es:JDR 

: 'EE JfPKAKi !KAPi tTATl'KESKTl'TTn UKKKKKTKI I^LHRTTC:; rtlKRIAPKPMVPSK 
KKKtVr.l.KKTVl'LI-rEIH.RHO:;:* ^NF^SOSS'IPPPVQPKA E LPWprK<JPTDP 

<:i*ii_o(j::7 4.:."ij m/'m 'MT^ r< ^' 

lon-Lon ATP-dependent Protease
PS IRTIVDSTTNSDSPILDPNPEJPVEKLLDE3EEE5EDQSTERLLPSELFILPLNKRPFF
PGMAAP I L I ESGPYYEVLKVLAK:;: ;QK Y I GLVLTKKENAD I LKVSFNQLHKTGVAAR ILR 
IMPIEGGSAQVLLSIEERIR t 1 EP [KDKYLKARVSYHADNKELTEELKAYSISIVSVIKD 
LL KLNPLFKEELQ I FLGHS D FT E PGK LAD FSV ALTT ATR EELQEVLETTNMH DR I DKAL I 
LLKKELDLSRLQSSINQKIEATITKSQKEFFLKEQLKTIKKELGLEKEDRAIDIEKFSER 
LRKRHVPDYAMEVIQDEIEKLCTLETSSAEYTVCRNYLD^TIIPWSIQSKEYHDLXKAE 
rvU«DHYGLDEIKQRILELI3VGKLSKGLKGSirCLVGPPGVGKTSIGRSIAKVLHRKF 
FRFSVGCMRDEAE IKCHRFTY IGAMPGKMVQALKQSQAMNPVIMIDEVDKIGASYHGDPA 

■• vl'f. /[./^ •• ■ \ 'VI'. 1 iA . ''M ' \'\"-*.\RE,VA P^r^i^MrKKVLPKa'- 

LK I VQNQEKPKSKK I T F K I L o KN UJ V\ LG K P I F J J D R F Y EST PVGVATG LAWTS LGG AT L 
YIESVQVSSLKTDMHLTGQAGEVMKESSC IAWTYLHSALHRYAPGYTFFPKSQVHIHI PE 
GATPKDGPSAGITMVTSLL3LLLETPW^I^MTGEITLTGRVI£^^ 
LNILIFPEDNRRDYEELPAYLKTGLKIHFVSHYDDVLKVAFPKLK 

CPn_0028 43323 42543 

No robust homolog present in Genebank/EMBL as of 11/7/98
RMFLQFFHPIVFSDQSLSFLPYLGKSSGI I EKCSN I VEHYLHLGGDTSV I ITGVSGATFL 
SVDHALPISKSEKI IK ILSY I LILPL ILALFI KIVLkl I LFFKYRGLI LDVKKEDLKKTL 
T PDQENLS L PLPS PTT LKK I HALH I LVRSGKTYNEL I QEGF SFTK ITDLGQAPS PKQD IG 
FSYNSLLPNFYFHSLVSVPNISGEERALNYHKEG^EEMAVKLJCTMQACSFVFRSLHLPSM 
QTKDKKAGFGLLTFFPWKIYPL 

CPn_0029 43839 43390 

No robust homolog present in Genebank/EMBL as of 11/7/98
SNKNERNENI YCFNLFRY I RFF AALN I RMNDGLRFC Y SY I LLRPMLLDS S LL RXGGQELL 
KKFQIKLRTTSIKSSLISLRQQLGKREATQSDILYGTSRFQYLNSFEIEDPRIPPTMAAQ 
LQEITWSRSVMELKI KFYVYLNSERNKTKP 

CPn_0030 43840 44529 

gcp-O-Sialoglycoprotein Endopeptidase

I^GVCWYSLFFYIKNRRMYFYKYVIIDTSGYYPFLACVDNMVLE^SLPVGPDL^ 

FL FKSKNLS FQGVAVALGPGNFS ATR IG ISFAQGLAMAKNVPLLGYSSLEGYLLSKDEKK 

AIJ^PLGKRGGVLTLSSEIPEXGLNEKRRGVGPGALLSYEi^SDYCVAHGyYHVISPN 

LFASSFSDKITVEWAPSVBQIRRHVISQFMFVEYDKQLSPDYRSYSCIF 

CPn_0031 44708 44884 

rs21-S21 Ribosomal Protein 

CMPSVKVRVGEPVDRALRI LKKKI DKEG I LKAAKSH RFY DKPSVKKRAK SKAAAKYRSR 

CPn_0032 44881 46098 

dnaJ-Heat Shock Protein J 

SLIGNVVWGSVSGMDYYSILGISKTASAEEIKK^YRKIAVKYHPDKNPGDAAAEKRFKE 
VS EAYEVLSDPQKRDSYDRFGKLX3PFAGAGGFGGAGGMGNMEDALRT FMGAFGGEFGGG S 
FFDGLFGGI^EAFGMRSDPAGARQGASKKVHINLTFEE^^GVEKELWSGYKSCE^ 
QG AVNPCG I KSC ERC KG SGQWQSRG FFS MAS TC P ECGGEGR I ITDPCS SCRGQGRVKDK 
RSVHVHIPAGVDSGMRLJCMEGYGDAGQNGAPSGDLYWIDVESHFWERRGDDLILELPI 
G FVDAALGMKKE I PT LLKT EGS C RLTVPEG I Q SGT I LKVRNQGF PNVHG KGRG DLLVR I S 
VETPQNLSEEQKELIJ?TFASTEKAENFPKKRSFLDKIKGFFSDFTV 

CPn_0033 46129 48171 

pdhA&B/odbA&odbB- (pyruvate) Oxoisovalerate Dehydrogenase Alpha 
& Beta Fusion 

ERSMGWQNQVI SSI RDVL KLVWELRF AEH KMLLLS RQSG SGGT FQ LS C AG H ELAGVLAG 

KSLIPGKDWSFPYYKDQ^FPIGLGCDLSEIFASFLARTTPNHSSARMMPYHYSHKKLRIC 

CQSSWGTQFLQAAGRAWAVKHSSADEWWSGGDGATSQX3EFHEMLNFVALHQLPLITV 

IQNNHWAISVT'FEDCCGADIASLGRCHQGLAWEVDGGNYTSLTCT 

AL ILI DVVRLSSHSNSDNQEKYRSALDLKLSMDKDPLILLEKEAINVFGLSPFEI EEIKA 

EAQEEVRKSCEIAEALPFPSKGSTSHEWSPYTETrLIDYENSESAQNLRNSEPKVMRDAI 

S EAL VE EMTRDSGV I V FG E DVAGDKGGVF GVTRNLT EK F G PQ RC FNS PLA EAT 1 1 GTA IG 

MALDGIHKPWE IQFADYIWPG INQLFSEASS IYYRSAGEWEVPLVIRAPSGGYIQGGPY 

H SQ S I EGFLAHC PG I KVAY PSNAADAKALLKAAI RDPN PWF LEHKALYQRR I FSACPVF 

SHDYVLPFGKAAIVHPGKDLTI VSWGMPLVLSLEVAQELASRGI S IEVI DLRTMVPCDFA 

TVLKSLEKTGRLLVIHEASEFCGFGSELVATMSEQGYAYLDAPIRRLGGLHAPVPYSKVL 

ENEVLPHKES I LQAAKSLAEF 

CPn_0034 49496 48210 

CT345 hypothetical protein
VNFLLPTTCRGILMAEISTPSLPDSS IVSQKTPPVPDPDSSPDHIPTI PTQAPFKPQRKK 
ETPSSrVNAIAFAILAFLSCLGGVFAICLGCSLEITMPLFILTAVFIAFTLLYFIHYLEK 
PKIPEPLPTPPPSPTLRAPTLTPEIPAPAPGIPLPPTLPKVDRTKLTCNPDIHYPSTYDP 
KACFSLLKQLFSLDPETRPEDRKYSNKLASILLRSKEKSGFRFHCFKGHFSHDKILNKKS 
GAWI S SHS SMD FSTTLGRAFAVTTC LQR SCWEK I KNN I PT P EfCHL P IGSCVSGPWDVEE 
GAQLYTSHLIVINPPTLETLIKEKMRRAITLKDFSMKEAFTNLVLiAYLQCFDICIEHNLE 
SVQLEVFGLNNLSADQEEFTTWESCCHLALLESVRILLASKEEYALSNVSVNSISQVPLQ 
TACRALFLN 

CPn_0035 51146 49569 

CT339 Hypothetical protein

ARTT L EEDAGS S LK PL PKTF PC ATAL Y I T HRRE RKS EHQMWNRCQV FS S F F F RY P I S SW L 
I RLRASCECFQQRHP I FLCGLYWLAG ITSRGH PECSAL IL I FLGMFLPRNPKQWLPLASA 
WI ISLMLTPAPFLHDGPISGTFVIHHAGGCGTYYGEALCICTPCGKRAHHLSCQILSESR 
LELKKVfELEGTLHHTSQIVFKSNACYKEIPRSRFYIMKEKCRESSCHFLNHRFPSSEVG 
PFASSLLLGTPLPQNLRDLFRQKGLSHLFAISGWHFSLCATTLWMLCALLPLKIKKILSF 
[VLTSLACIFPMSLSVWRSWIS'-/TLLCFSWCFCGSC3GLHRLGAGFILCSIFFSPFSPTF 
VLSFLATLG I LLFFPK I FSFLYTPWTQFLSPFWLYP I R Y LAMTLAISLSAQLF IVL.P IMQ 
YFOSLPLEGLLYNLTVPFTILPI IVFLIATIILPCCSPITEALIQGFLSHPWLHNPNILK 
TLSFAPVPPWMLTIjAijLILFFIGILRTtWSPYASISATjYRFIETL 

(*Pn_0U^, 50 l >b r J bl79b 

rTUH hypor h^-r ir.i L prorom 

AKiUjWr/.CRKKMKKPDND^'rFDVRSFFPFDVLClEQLPKEMriWEVV.^AK t PPLPRGWYEL 
MGL^K-DREDFf LDlMCl'VLC I EHKE.^PS TCPFF'jLLCT I EVY IYP L.EKFPYQLKMFWF 
RDORl '( rro* li'-l'PIiLni'I ■« il IHRLPPLTIDPHYEKFFf] IHfFJFGKWrifJEf 1 1 FPMP'JLAKVOQK 
1 -f'L^t.VVMNKMOAKDNt 'V'l /.J [ V\ J FY r ,YKEPFAYQCFF"DPR I RPULE" M'MVLLNIIC'JLE 

iiR , -t,i-Trr[,LHi.:;K:.YYP-:FL..;wr.r , jr^ui:;Env"^NE 

( PtL_00 ;} m !()/ '>2 L l r > 

ptsH-PTS Phosphocarrier Protein Hpr

KLE/:i*! I ' 'IA<A I I LL.i'SRt IWRT I / X VP PC I MNEPTRTYLE.1EKDTQDO r FCE/jATi ' [7KN 

^A^;rHv^*PAl ;viVRi.n>c;i-;n T//nfTYAGKTtMAKS [m^j n^Mi-ziAPOt llvt[r:;kea 
■ IP I HJK ['JDAFSSCFGEL

; ;pn_ooiH 'jjLr> siflu 

CT3 1H hypothetical protein

MDTQSSTGNEEWR [ACTS IV3GMALCKVFFLGT3PLHVRELTLPQEET/EHEIHRYYKALN 
Ro K3 D I VALEQEVTGQCGLQEV3S j LQAHLE I MKDPLLTEEWNTIRKDRKNAEYVFSSV 
MCKIEEGLTAVRGMP'JWDRVQDrHDISNRVIGHLCCQHKSSLGESDONLriFSEELTPS 
FVASAN^AY IRGFV^LVGAAT^HTA IV3RAKS I PYLANI SEELWNIAKPYNGKLVLIDGY 

:-tj; at: .... //-:['" ji' l % tv " :i<.v , *c^7"r'[ l. rr^c 

. : : ;!,:•■;■:. av : i, : ; , '..*'T •'"'[ A"f . 1 *vr.i ■ .f ' . '^e*'!' ; fiKKE 
R3 CRWLLDY3VIljEDQLQAIAKA3L(0GSIKVLIPGVSDVSEI IEVKKKWETIRTRFPKGH 
KVSWCTMI EFPSAVWM IEE I LPECDFLS IGTNDLVQYTLG ISRESALPKHLNVTLP PAVI 
RM I HHVU3AAKQNQVPVSICGEAAGQL3LTPLF IGLGVQELSVAMPVINRLRNH IALLEL 
NSCLEITEALLQAKTCSEVEELLNRNNKITS 

CPn_0039 54256 53963 

CT339 hypothetical protein 

I SMGSG YAKKKKEAK I MEQQFL EMEASLLEKRYEGQAGNG LVSWINGKCDL I SVKVQ PT 
CLDPEDPEVI EDLFRAAFKLAKEQMDQEMSLMRSTMPF 

CPn_0040 55673 54318 

dnaX-DNA Pol III Gamma and Tau 

AFYTHSIXTYTMTLQPYQASSRKYRPQIFREILGQSS^/VAVLKNALV^ 
GTGKTTLAR ILAKALNCVHLSEDGEPCNQCFSCKEI ASGSSLDVLEIDGASHRGI EDI RQ 
rNETVLFTPVKAK FK I Y I IDEVHMLTKEAFNALLKTLEEPPOKVKFFFATTE IHKI PGTI 
LS RCQKMHLQRI PEKT I LEKLS LMAQDDH I EASQEALAP I ARAAQGS LRDAESL YDYVT S 
LFPKSLSPDTVAQALGFASQDSLRTLDNAILQRDYATALGIVTDFLNSGVAPVTFLHDLT 
LFYRNLLLTNSTTSKFSSQYKTEQLLEI IDFLGESAXHLQNTIFEQTFLETVI IHIIRIY 
QRPVLS ELI SS IKSRQFEGtJWIKEPTLTCXJVSAPOPQPTYKEQSFLEKKNQPAAEGKI I 
SVEVKS S AS I KS AAVDTLLQ FAWEFSG I LRQ 

CPn_0041 55888 57342

No robust homolog present in Genebank/EMBL as of 11/7/98 

CKYLYHHSYPPPQHSVGSISSRYKIJRVI^ITFLVLGVIJJjISGALFL^ 

GLGIGLSALGGVLWSGLLCLLVKREVSKVCPEEIPAVQPEETPEG 

QKEQKTQKII^JQLPQELDQLDRYIQEAFACLGPLKDIJCYEIX^FL^ 

DM I AE FVELQQ ILCQEGRLLEFVI NOT RY I GRDL FKRED S LYKLWEWLGYLP SGDVRG ER 

LKKSAREVTORFMRTTCNIRKIAOTFDRHVYSVAKTAFEKAFGALETCVYESMR^ 

FCEYEKAKLDGDEEKSAHA£QRFQDIK>niWEDVKDAFFWVKEDGKIEIDDAIGNSCKWSE 

RYEEHR ITRARWYKVAEHQLFNATMRVKDSLR EHNEARVAFE KERS KENQRQVQKKKEKR 

LRTDl|KELHDQELPRAQERLRELQALYPEIAVSWEARREVASD 

CPn_0042 57346 58182

No robust homolog present in Genebank/EMBL as of 11/7/98

E EE EKQ EAE F RENGTK IRS MEEVS EY LQQVENGLESC S KRLT KMET F ALGVRLEAKEE I E 

Sl¥lSDVVNRFEVLCRDIEDMLSRVEEIER>n-RMAELPLLPIKEALTKAFVQHNSCKEKL 

TI^SPY FKE S P AYLTS EERLQS LNQT LQRA YK E SQKVSG L ES EVRAC REQLKDQVRQFET 

CGWLIKEEILFVTSTFRTKFSYHSFRI^PCMRLYEEYYDDIDLERTRARWMAMSERYR 

DAtEQAFQ EMLKEG LVEEAQ ALRETEYWLYREERKSKKKH 

CPn_0043 58432 60372

No robust homolog present in Genebank/EMBL as of 11/7/98
mgf&MQVPLS PQLPP PPPDHSVGAS FCLSKFRVLA I TFLVLGVLLL I SGALFLTLG ISG 
VSL^^LGLSALGSVLVISGFLLLLERREVSGVGLEGIPTGIPVGPSAEPSSEEIQKKQK 
ATCQ 1 LDQL PQELDQLDTDI QHVLSC LGKL KDLKCKDRGLLJtDAKEKLQVFDFVWKDMMME 
FVELQQVMDQESRYLEGLI HEVQS I AHKLFVDDVNIRSHLGESCGYLPS EDVRGELLKRF 
AKl^ARFMKVTRDIRKIAMAFNKNAYGAAKNAFDKAFGSLETCLYKSLTKSYRDTFCDY 
KRA^JLPDENNSARAEQRFREVKDHWEDLKETVFtW 

L I LEKRKDKVMS HQLWEATMRVKEAEVTY SVARVAF EKDGSQQNQKKFQEKTKERLRCLK 
DLRJ^ECHPAQERLEKLTALYPEVSVSWETERERKFNLEKAYGNLEERYQSVVQDQEDY 
WT EQ'KN REAE FRAKGTKVRSMEEVAEHLQ I LENLLEDCYKRLSKAETFALGVEREATEE I 
EYf : ILSDAA^mL^CVLCEDIEDTLPRVEEIE^I^^.RMAERPLHPIKQAFTKAFVQYNRCKER 

LAKVEPYYKES PAYVNS eerlqsldqasqc iorvpkgfkfrngsmyi 
CPn_0044 60278 60778

No robust homolog present in Genebank/EMBL as of 11/7/98
I AKS DC RVW I RLHSAYKESQKVSS LETEACTY REYLREQVQQ fetqg vs L I KEELLFLS s 
TLKSKLSYDPL I ANI PCMKFYYQYYDDIDKARAQSRWLEKSERYRNAKRRFQEIVKKGLF 
KEAKPLKKE EYRLLQE ERSNKEKRL I YNKMAVARQRVQEF ESME I P E 

CPn_0045 60961 62790 

CT345 hypothetical protein 

CKYTYHPPQLPPDH SVGATSWQPKLRI LT ITFLVLGVLLL ISGALFLTLGVPGLAAGLSF 
GLG IGLSALGGVL WSGLLFFL I RRGVSKVR PEE I PVT PSHEAQKI LCQL PQELDQLDTS 
IQEWSCLGKLKDLKYEDQGLLTEVQEKLRVFDFVRKDMVTEFLELCQWAQEGQFLDYL 
INQVQS I SHKLFVPDVNIGAHLAELCGYLPSGDVRVERLKRSARQWDRFMRVTCDTRKV 
AMAFDENACGVAKNAFDKAFGALEEC VYKSLTESYP EAFYEYEKAK ILRNEDVEWLQDKN 
KSARAEQRFREVKDRWEDLKETVFWVKENGCIDLETv'LTAVGGWPDRGPEHLI PEKRRNKV 
MSHKLWEATMRMKGAEGTYSVARVAFEKDGSRKNQKKFOEKTKEWLRCLKDLHDQECHRA 
RERLAELEALYPEVSVSWETERETKFKLETAYGNLEERYQSWRdQEDYWKEEENKEAE 
FREKGTKVRS PEEWEYLO I LENLS EDCSKQLT I AS-Z-ZVLGVELEATAEFEYT I LSDAAN 
RL KVLC ED IEDILPRVEEIE I MLR I A ELP FL P I KQA FTK A FLQYNSC KDKLAKVEPYCQE 
SVDYKSCFRV 

CPn_0046 62775 63263 

No robust homolog present in Genebank/EMBL as of 11/7/98
ERFQ^LNODLQNVYQECQKATGLEJEVSAYRDHLREOITEFETQGLDVIKEELLFVSSTL 
KSKLCYDPLIADIPCMKFYEEYYDGIDKARVQSRWLEKSERYRKAKKGFQEMLKEGLFKE 
DQALKKAEYRLLREKRMNKF.KLLrCNKIEAAOORVOEFGP'3D.O 

CfTi_0f)47 M i7 b3b r >J 

No robust homolog present in Genebank/EMBL as of 11/7/98
KMFHLKVT tVrU'FKrVL.CC; t LTMYIirOK I PMTLTTC/UVLNKflLKKDYCLWFVYGSCPES 

kvklot:::;hkwl 

i.*tTi_0f)4M t, n^7 nSROL 

•yqtF hs .:ori:;fM vt-cl hypothot u <i I IM pr^r^in 

MKFJJWf;-:L*YNRA[.IIKr.:;ifyWVHYFr.YTFV^r'T!-[VA[rTFAWr.KVLYVPCYKAC3EISRIS 

LTAr-Mi)F::L::w::AiiKFYKRTAHir;EAF(:KVYEiLrL ;k>'>i j [,:;kl ; ;cinadentdywfkkaad 

Kt.L.'-'THFVU:!: IT'JK( 'LKDLC 1 Y PPLLGKCKKTLF T I EI Iw'NKf ;MVI AO^'F^'MLK [FLIQEN 
CPQPCFDA TMD I LK IANFEVAVDKEMSGCVKGELLCKRC I EK TTKGTP I LEKYQR I DDRD
AK I LKQLRAQLL3VHTLF3CR3LWGA I FVVLL I LLWGYGALKALCPEMLK3PQRFMLY I A 
tLTLoLLWCPGTEIFCAYWV3YL3YPPILPFTAVLLGYFLGLP rAGFSCTFLALLYTLGS 
DLWWSWFLSINLLC3WRILVSLHRVSRLSSVFWACMKLCX^A^SLLMFRrFTOT 
ALYADGIESFVYSLITAI3WALIPVFEA3FGASTNFSLLTYLSPENALLKRLFKEAPGT 
YQHSVLVGSLAEAAAOA IGADSLYCLVAAHYHD IGKLINPGFFSENQKI LQQSGHSLSPL 
ECAKMIMRH I PECWfLARQAGLPESF IQVIEEHHGTSVrRSAYYSHMVENPSTGSFDEEL 
FRYSGNKPS3KETT I rMIADSFEAASRSLKNASLPDLQRLIDQI IQGKLQDGQFSCSPIT 

Jl'ri_0 04 'J *> vjal ' 

No robust homolog present in Genebank/EMBL as of 11/7/98

LKEKRRNrVYLLVIYQEIFWl*TMLHQPYYDKILTGNTIYIPGHTHKDSNKLFQKKSRAIW 

VDEKPFSLDCFSNVFLIFVSLVPIAGLVRAYQIKKSLDRTTVQIGYSPSLSCEQKECVEA 

FVNGYGLICISILGGLGILVTILILVVT.SLIXDGILMLFSLSTYESIKNYISKHICW^ 

AT 

CPn_0050 66849 66499 

No robust homolog present in Genebank/EMBL as of 11/7/98
VSWFPILGI FLAMRYAKHGTNWNDE>TVKANLGYLPSTN I KTAG I L 

GGCGILLPIFT.fJJAII^ISVLFQLIMLPFRI^CFAlaRQSVSSDTVTNLLLI.NNTLA 

CPn_0051 66797 67111 

No robust homolog present in Genebank/EMBL as of 11/7/98 
CFAYLIARNIPR>KjNHETYIHPGVLPSSHAQDVSRSTVYPSRSFIMRRMLMGWNFNRVPS 
KSSEQLMDGHRIPLIFFGKHHPTIS ILNVNRFSWLS I FYNGERGF 

CPn_0052 68008 67304 

hemC-Porphobilinogen Deaminase

KMLSVCYSDPCLSDFCCGKRPLRIASRNSNLAKAQVHECISLLRSWYPKI>^QLSTTETT 
GDREKKIPLHLVEMSYFFTDGVDALVHKGVCDLAIHSAKDLPETPSLPWAITRCLHPAD 
LLVYADHYVHEPLPLS PRLGSSSLRRSAVLKQLFPQGQ ILDI RGT I EERLDQLH RG HYDA 
rVIJUCAASIJ^I^HAYSIELPPPYHALCXSSIAITAKDHAG^ 

CPn_0053 69350 67986 

sms-Sms Protein 

IRMATKTKTCWTCNQCGATAPKWIX3G<rPGCHNVWSLVEF/YVP^ 

SIELENESRIFIDHAG>©RILGGGVVRGSLTLLGGDPGIGKSTLLI^ 

WCGEESVTQTSLRAKRIJ^ISSPLIYLFPETNIi^IKQQIATLEPDILIIDSIQIIFNPT 

LNSAPGSVAQVREVTYELMQIAKSAQ ITTFI IGHVT KSGE I AG PRVLEHLVDTVLYFEGN 

SHANYRMIRSVKNRFGPTNEIiLILSMHADGI^EVSNPSGLFl^EKTGPTTGSMI 

SGALLI EMALVSSSPFANPVRKTAGFDPNRFSLIXAVLEKRAQVKLFTMrTVF^ 

KI IEPAADLGALLAVASSLYNRLLPNNS I VIGEVGLGGE IRHVAHLERRIKEGKLMGFEG 

AILPEGQISSLPKEIRENFRLQGVKTIKDAIRT.r.r. 

CPn_0054 70089 69313 

rnc-Ribonuclease III

TLSFFPP IKIPNSKFKDGALLSMHPP IDITAI EAKLNFTFTQPKLLEI ALTHPSYKNESA 
VQ I EDS ERLEFLGDAVLGL IVTEHLFLLF PSMDEGTLSTARASLVNAKACCRYTTMLG IG 
DYLLIGKGEKIQSERGRLSAYANLFESILGAVYLDGGLSPARKLTVPLLPPREEILPLMS 
GNPKNLLQQFTQ KQF R VLPVYQ ST AVTDACGNVSYQ I QVLVNQEVWG EGN AS SKKEAEK I 
AAC^ALDTYGNKNQNTMDV 

CPn_0055 70096 70590 

CT296 hypothetical protein 

CFWICYLIRrRMRSALHLOHLRHFHNHGS ILFENLLT I KDCFLLETKLQNFI AKASKTID 
T^/RWRENIFRSMPEIYTVVRXRRLDFFAAELVHRPKLSLVRDLWVFPGEEILEGEEDCML 
FLLLSGDRAGSGI FFTGPYPSDLYELEKGTTGLLLAFSSVG I PV I 

CPn_0056 70917 72746 

mrsA-Phosphomannomutase 

E F LKLS LH R I SLMKEV EQR I RS LYDAVTAEN I CRWL SNDCTQQDAKT I LGWLDTD P AQL E 
DLFGATLTFGTC^LRSU^GIGTNRINLFTIRRTTC^LVQVLRAHLPHPGDPMRVVVGCDT 
RHNS I EFAOETAKVLAGNGCEVFLFQY PE PLALVS FTVRYERAIGGVM ITASHNP PNYNG 
YKVYMASGGQVLPPU3QEIVAACSAVNEILSVPSIDHPNIHLIGKEYEALYRDTLKOLQL 
YPEANRISGRSLSISYSPLHGTGISLVPHVLKDWFLSVHLVEKQAIGDGDFPTVQLPNP 
EDPEALTI^IEOMLANDDDLFIATDPDADRVGVVCLEDGQPYRFNGNOMASLLADHILGA 
WSKTRHLGEHDKLVKSLVTTEMLSAIAKHYHVDLIWGTGFKYIGEKIESWRNSTNKFVF 
G A EE S YGC L YGT HVED KDA 1 1 AS AL I AEAALOCKLGCKT LC DALLS L YETYG Y F ANKTE S 
WFSAKTDEQEIRKKLSHLEEISSANFFSGKYQVEKFENYKQGIGFNLLSKDSYALTLPK 
TSMLCYYFSGGGRVI I RPSGTEPKI KFYFEMSTHYPERVTDKEIQKQREAESFQHLDDFI 
FDFKEKFSNL 

CPn_0057 72913 73554 

sodM-Superoxide Dismutase (Mn) 

ILKRYWMSFVPYSLPELPYDYDALEPVISSEIMILHHOKHHQIYINNLNAALKRLDAAE 
TQQNUJELIALEPALRFNGGGHINHSLFWETLAPIDQGGGQPPKHELLSLIERFWGTMDN 
FLKKL I EVAAGVQG SGWAWLG FC PAKQ ELVLQATANQDP LE PLTGKL PLLGVDVWEHAYY 
LQYKNVRMDYLKAFPQI INWGH IENRFSEI ISSK 

CPn_0058 73627 74562 

accD-AcCoA Carboxylase/Transferase Beta 

IRWLVRLFSYDKPKIKVQKIKADGFSGWLKCNHCHEMIHANELGONYNCCPKCSYHYRIT 
AIERVKLLADKDSWRPLYTDLKSQDPLEFIDTDTYANRLEKARKNTTESEGVIVGICTIG 
LH PVALAVMDFNFMAGSMGAWGEKLTP L I EEA I ETRLPVI I VSASGGARMOESVFSLMQ 
MVKTSAALAKLH EAGLPY I SVLTNPTSGGVTAS FAALGD 1 1 1 AEPKAL ICFAGPRWAQV 
IGEDLPEGAOKSEFLLEHGMIDKIVERKELKTTLCTLLDYFLAQEYTGGKSKAPRDLSKR 
LKE I FLLTDDSE 

CPn_0059 74S6? 75050

dut-dUTP Nucleotidohydrolase

IKHHTA;JCNDN I ICN\ [LMTVFCELDrjGGELPE-m-pGAAGADLRANr EEP [ALLPC^QRA 

l r pTf ; : kae i pegyelovr pr^glalfhg r tvlms pct r dsdyrce i rv el r nfc-dutf i 

l EPKMP IAQWL:;pWOATFWKQ£nL^\ETARC:;GCFr;i ITGAS 

^Pn_00^n 75004 7 ( ' J r ).>H 

ptsN-PTS IIA Protein

PKLPEF/EVLV [ LC0AKMPSYr'0MCQLrrjLFJLL3PRLVMFLGKi l^RDE ILQUl .TP I, V DA 
AGLLEDKQAFFDALVRRENlMfJTG rGWJVA I PHGKLErJCrJMFF IAICI IIITQiJ II-WL^A [DG 
ALVRLVFLrGGPENAO^EYI.KLLrjTLrL^r.RKE^PPWrXOVNTrEEVMNVRVfTM 
CPn_J)()fi L 7SVH 7620R

ptsN-PTS IIA Protein + HTH DNA-Binding Domain

RSHEC [ CC I DVK MDLKLDEVAo L LDVG EHTVLQWLK EGA I PS YSMNNEYR F3 R EE I ENWLL 
HNQA LM I Q E RG EDK EAL KDL3 LKY 3 L Y KA I HRGGVLC DWVHS KE EALQY ASKY I AQKFQ 
LDE5VLFEML;:HPENLMGTG IGEG I ALPHAKDFL INAYYDIWPMFLAEP I EYGALDGKP 
VGI LFFLFACQDKGHLNL\/NKIVHLGMSLNARSFFKNYPNKDQL^ 

OPn_0062 76251 77690 

:■'!•:;■!•■ ' T< ; IPC!!! I " "I'MM." 'i' I \7 T . J * "' TM'. E< irAf-V "/ 7PYY IT " 

LVVKKPQKGSERKQAKKEPRARKGYLVPSSRTLSARAUKMKNSSRKE33GGCNEISANST 
PRSWLRRNKRAEQKAAKCGFSAFSNLTLKSLLPKLPSKQKTSIHEREKATSPJ^VNESQL 
SSARKRYCTPSSAAPSLFLETTEIVRAPVERTKEI^DNEIHIPVVQVQTNPKEQOTKTTICQ 
LASQ A S I QQ S EGT EQS LRELAQGAS L PVLVRS NP EVSVQ RQ K E ELL KELVAERRQCKRK S 
VRQALEIARSLTKKVARGGSVTSTIJ^YDPEKAAEIKSRiWCKVSPEAREQKYSSCKRDA^ 
^KQDKTTPSEDASQEEG^TGAGLVRKTPKSQVASNAQNFYRKSKNTNIDSYLTANQYSC 
SSEETDWPCSSCVSKRRTHNS I SVCTMWTVI AMIVGAL I IANATESCTTSDPTPPTPTP 

CPn_0063 78109 78267 

No robust homolog present in Genebank/EMBL as of 11/7/98
PMYANCKHNCLCLYDFSRHRS PPGLPLTFTPPYSFTLG I FLGRCLSTSNI VLL 

CPn_0064 78340 78576 

No robust homolog present in Genebank/EMBL as of 11/7/98
LVMT K I QC S AQYYR SR PAERAQT PPQ PFLARDRAD FWERH PRFS ACCRVLLLVAWWLAL 
LFLFVMLLPLAAGSYLLAF 

CPn_0065 78882 80651 

CT288 hypothetical protein 

YDYYKYNMFFKKNYMTDFPTHFKGPKLNP IKVNPNFFERNPKVARVLQITAWLGI I ALL 
SG I VL I IGT PLG A P I S M I LGGC LLASGGALFVGGT I AT I LQARNSYKKA VNQ KKLS EPLM 
ERPELKALDYSLDLKEVWDLHHSWKHLKJCLDI^ 

MIS ENY DACL KMLAYR EELLKEQTQYQETRFNQNLT HRNKVLLS I LSR I TEN I SKAGGVF 
SLKFSTLSSRMSRIHTTTTVILALSAWSVMVVAALI PGGILALPILLAVAI SAGVIVTG 
LSYLVRQILSNTKRNRQDFTKDFVKNVDIKU .NQTVTIORFIJEMIJCGVLKEEEEVSLEG 
QDWYTQYITNAPrEKRLIEEIRVTYKEIDACTKKMKTDLEFLENEV^ 
SETPIFTQGKEFAKIi^G/rSQNrSTIYGPDNENIDPEFSLPWMPKKEEErDHSLEPVTKL 
EPGSREELLLVEGVNPTLRELNMRIAUjQCXJLSSVRKWRHPRGEKYGNVIYSDTELDRIQ 
MLEGAFYNHLREAQEEITQSLGDLVD IQNRILGI IVEGDSDSRTEEEPQE 

CPn_0066 80916 82655

No robust homolog present in Genebank/EMBL as of 11/7/98
G^MANPTQSRPPSPEISIEELEt^EI^SSNT^ 

N'S;EDEEG PLGSCEVYDWC ITNQGD PEVRDHEVRVMY I NGSGRTQHEG I LDAMNICDLRG 
Efi^RFI HNSGYGLGSCFLG IRNRIPPRDNVI SQAIQARWNEFFIFAENANRDYIVLFSGN 
GpLYLQVALDNS I YSHHILCVG I G 3 S YY I QGNYRVHNYRVTG CWTT LLDRRG ATAVNTTT 
I«^SAEGLFLPSVRCPSYQWAIJ*CGE^ 

STOTRLIEWIDRGDSQAVLEIJ^PQPSHCRDIALTALYATTRISSIJ^ 

F^EYA I VTG YS I MTLRYF I LLLTNR PGC RRH F RVIJILAAIjGLQS LG FLTVLLDH I NVT RR 

Vj^RR P P L I SV I FCTAS FATGS F I YVDLTRMF FT S IJ^S RLQLFVQRRLTG RGL PLRRVFVN 

tCoSLRFSQNAL ITFHGGLFMPLI IGFFNQLVIQVPRWIRPirTTAVYDLNQTSQEAWDS 

GP^IGQTINFLLCMILLVIOTFFFVRSVRRNLHRRPHR 

CPn_0067 82920 84053

No robust homolog present in Genebank/EMBL as of 11/7/98
KGSGYSYRGPPMAVEGRVNSSQAIJ^QIXZQEVIJU^QSKGIJ^CRrLSIWAVITFIAGVV 
LI ALTLAS I LTS VPYLALGVFLLIVTLGC 1 1 FALCSEKrKKVPPTP I SHKEE 1 1 AWFEER 
KB=IBMEKEKEDPEHFGRTATDI PMRSALDQFNHSCHH IHESPALTETYRSHQDVLLFKDW 
CpVTLPDVTSEEEVL IRSWGSYLLMEACVPKVSML I DELHNKLKS PSERECLFIDKKTL 
Qi^^CASFLFTQKDLATFFI^YTRV^^X3H^APFRAGAKWILIHYVRIJRR 
CYYARLAFNQTQRLYHQLFNVEKLRS IYANMDKDPLCHPWAF I PIYDLLKTEDHGDGFLE 
Q^Et>REYPSRAAQDQFWG 

CPn_0068 84909 84331

CT^60 hypothetical protein 

SFKIKKFFIYSLIFSCSFSAPLKGICNEDVSSQSRIEEDPEVLITQLNELIETPIEEGKE 
IRjf£LQAISDGQKSSEEIEESCGTSDSEX2LSEKTDKESSNEYVLDFFDSMVQRLEGrSKM 
CQSGQVAQ I IDC FNREFD I RNRELELKNRELELREKDLEFKKS ILDWNKEKVS RELAFQR 
EQDIKQTLMLLKK 

CPn_0069 85191 87086 

No robust homolog present in Genebank/EMBL as of 11/7/98
I^FLYWLLIFNI^IMTTPPPSRSSSPPPYWIEI^DLGm'NNNSSRATPPPPEVGGELP 
PYFSASNFWIERGAPSLPSPQQLLSLPEYSRQPPPGYFDETASITSRTSEEMFGTLVST 
LCC PANS ERDWEDHEVNC I Y I ASTSDTQLEAVQGGMH ITELRGEPVRVLYETGHLYAFAR 
EOTCHSRLEVSHTVRAMTYFWDRFFSRHWNVGRRFLVFYQGNGGAYVQAALDSSMHTQDI 
YVLGLS PTVY I RGNYHVQH YRVRGFWPSCLDSLAACAENTSVLPYGESSDG I FYPSLFSH 
TFDNAI RYGERCLLVCSEGMGMLPETQQQTSPLTSLEGGHEVALVLNPQQNPEALS IASR 
LM H E ERGG R L ESNYM PGRS SN P FMTSMYVLVR LNT LAQ I Y LMS PYYS FQ SND I VCL I F I S 
SAAVETVSY I FLTVTDSTCGRRYLRVPRLVCTGLRNLALPTTLLELLILSYPRSVEGVPF 
NVRFILGYMCTTRWFFAWNLILHWPFRCLRHGIQLFVHRS I IGHTLGARITDLTLASMR 
YAIVFP3 TV3SC LLTALAHANTNILALDPYRL IESGDLRRPAFKDDEMQQADNPWDAYS I 
GLV INTC T YML r LFANLIFMVYSVRRYHRSRR 

CPn_0070 87399 87208

No robust homolog present in Genebank/EMBL as of 11/7/98

YKVGLFHLKNONFFSNOSRTYEQRFPKVSPHFESILPLOSVGFSSQGTLLISFRDTELKR 

DLY E 

CPn_0071 880*>6 87599

r "T i2 c j hypothetical procein 

I Kf;LR. r j LLEF ECPLQHARCLKKQHKI rEELFPEPFQKDHLYLKLMENSSSRDAFDKKRML 
KKNr.WfJCO.'IDLYLYEVYQDGILFFFTYTKALMSOJIASLFTEVYSGETPSTtLTr'KPrF 

1-tjnur i' yl: : rr ;r lnu ;eulymrmkq i avqylk ppqt 

*'t'n_fJ()7:i H'»lbL Hfl()57 

'."['KM hy pother km L protein 

i<GYR. f JTKT:;VYKEKVL I L E YCLLFYFFHYRMijTPL" jOG ICP. ( JDQYVPQELFCDRL33SR 
:;N:;PU.';MA:;f;D:;P[V';PPI;;ALVALTDLKLVPYNOM^FnWTTRLKNAVEKrGLFLQRNWK 
Y r t,LY 1 1 .AWAL [ LVCI1MTVALTLT IWLGVGLG ICWFG tFTATCLDKENKHRHVNSLWNL 
LNH( J f Lyi 1 [>PW;TRQ [ LLATMIAH I SAL E YAV PQAVGLV EGFS tCNQL^ INTVYGARLGD 




EATYAIDPKAHKKPtENrEOAINOKO I IK iff, M : VQKQLMAL E EINRNNQTDPATANLLAS 
LKLNLJ4QPMPYCF"MPECOVT33YLDLNNN'>FDD I IARADQC rMTLoCTLQC IKKEPDRI 
IESNH 

CPn_0073 89353 89574

infA-Initiation Factor IF-1

SMAKKEDTLVLEGIC/EELLPGMHFRVT LENCMPVTAHLCGKMRMSN I RLLVGDRVTVEMS 
AYDLTKARWYRHR 

i •□_<.•■ ■ ; if 'm/^- 

tufA-Elongation Factor Tu

EDFEMSKETFQRNKPH INICT IGHVDHGKTTLTAAITRALSGDGLASFRDYSS I DNTPEE 
KARG IT INAS HVEYET PNRHYAHVDC PGHADYVKNM I TGAAQMDGAI LWSATDGAMPQT 
KEHILLARQVGVPY IWFLNKVDMI SQEDAELI DLVEMELSELLEEKGYKGCPI IRGSAL 
KALEGDANY IEKVRELMQAVDDNI PTPEREI DKPFLMPI EDVFS I SGRGTWTGRI ERG I 
VKVS DKVQLVG LG ETK ET IVTGVEMFRKELPEGRAGENVGLLLRG IGKNDVERGMVVCQP 
NSVK PHTKF KS AVYVLQKEEGG RHKP F F S G YR PQF F F RTT DVTGWTL P EGT EMVM PGDN 
VE LDVE L IGTVAL EEGMRFA I REGG RT IGAGTISKINA 

CPn_0075 91087 91350 

secE-preprotein translocase

SRSWFMKQQHNRKALSRKIGTVKKQAKFAGSFLDEI KKI EWVSKHDLKKY IKWL IS I FG 
FGFAIYFVDLVLRKS ITCLDG ITTFLFG 

CPn_0076 91334 91903 

nusG-Transcriptional Antitermination
QPrcSVNOfYKWYWQVFTAQEKKVTCKALEDFKESSG 

KWEKY IWPGYLLVKMHLT DESWLYVKST AG IVEFLGGGVPVALSEDEVRSI LTDI EEKK 
SGVVQKHQFEVGSRVKINDGVF\/NFIGMVSEWHDKGRLSVMVSIFGRETRVDDLEFV^V 
EEVAPGQESE 

CPn_0077 91956 92435 

rl11-L11 Ribosomal Protein

FFVSYPLFVEVSC^KVRFSMSVKKVIKIIKLQrPGGKANPAPPIGPALGAAGVNIMGFCK 

EFNAATQDKPGDI^PWITWADKTFTFITKQPPVSSLI 

TQAQVEAI AEQKMKDMDIVLLES AKRMVEGTARSMGI DVE 

CPn_0078 92453 93160 

rl1-L1 Ribosomal Protein

SCRIMTKHGKRIRGILKNYDFSKSYSLREAID ILKQCPPVRFDQTVDVS I KLGIDPKKSD 
02 1 RGAVFL PNGTGKT LR I LVFASGNK^K EA VEAG AD FMGS DDLVEK I KSGWLE FDVAVA 
TPDMMREVGKLGKVLG PRNLMPTPKTGTVTTDVAKA I SELRKGK I EFKADRAGVCNVGVG 
KLSFESSQI KENIEALSSALIBCAKPPAAKGQYLVSFTISSTMGPG I S IDTRELMAS 

CPn_0079 93170 93688 

rl10-L10 Ribosomal Protein

RGKMKQEKTIJLI^EVEDKISAAO^FILLRYIJIFTAAYSREFRNSLSGVSAEFEVL^ 
FKAIE^GLEVDCSDTDGHLGVVFSCGDPVSAAKQVLDFNKQHKDSLVFLAGRMDNASLS 
G AEVEAVAKL P S LKEL ROQ WGLFAA PMS Q WG IMNSVL S GV I SCVDQKAGKN 

CPn_0080 93720 94121 

rl7-L7/L12 Ribosomal Protein 

VRVTKVTTES LET LVEKLSNLTVL EL SQL KKLL EEKWDVTAS AP WAVAAGGGG EAPVAA 

EPTEFAVTLEDVPADKKIGVLiCVVREVTGLALKEAKEMTEGLPKTVK 

KLQDAGAKASFKGL 

CPn_0081 94219 98016 

rpoB-RNA Polymerase Beta 

FREILSHQNSRRTRMLKCPERVSVKKKEDI PDLPNLI EIQ IKSYKQFLQ IGKLAEERENI 
GLEEVFREI FP IKSYNEATVLEYLSYNLGVPKYSPEECI RRG ITYSVTLKVRFRLTDETG 
IKEEEVYMGTIPLMTDKGTFI INGAERVWSQVHRSPGINFEQEKHSKGNILFSFRI IPY 
RGSWLEAIFDINDLIYIHIDRKKRRRKILAITFIRALGYSSDADIIEEFFTIGESSLRSE 
KDFALLVGRILADNI I DEAS S L VYG KAG EKLS TAMLKRM LDAG I A S VK I A VDADENH P 1 1 
KMLAKDFTDSYEAAIJCDFYRJ^LRPGEPATLANARSTIMRLFFDPKKYNLGRVGRYKIjNRK 
LG FS I DDEALSQVTLRKEDV IG ALKYL I RLKMGDEKACVDD I DHLANRRVRSVGEL IQNQ 
CRSGLARMEK I VRERMNLFDFS SDTLT PGKWS AKGLAS VLK DFFG RSQLSQ FMDQTNPV 
AELTHKRRLSALGPGGLNRERAGFEVRDVHASHYGR ICP I ETPEGPNIGLITSLSSFAKI 
NEFGFIETPYRIVRDG IVTDEI EYMTADVEEECVIAQASASLDEYNMFTEPVCWVRYAGE 
AFEADTSTVTHMDV5 PKQLVS IVTGLI PFLEHDDANRALMGSNMQRQAVPLLKTEAPWG 
TGLECRAAKDSGAIWAEEDGVVDFVDGYKVWAAKHNPTIKRTYHLKKFLRSNSGTCIN 
QQ PLCAVGDVITKG DV I ADGPAT DRG ELALGK>A/LVAFMPWYGYNFEDA 1 1 ISEKLIRED 
AYTS I Y IEEFELTARDTKLGKEE ITRD I PNVSDEVLANLGEDGI I R IGAEVKPGDILVGK 
ITPKSETELAPEERLLRAIFGEKAADVKDASLTVPPGTEGVVMDVKVFSRKDRLSKSDDE 
LVEEAVHLKDLQKGYKNQVATLKTEYREKLGALLLNEKAPAA 1 1 HRRTAEIWHEGLLFD 
QET I ER I EQEDLVDLLMPNCEMYEVLKGLLSDYETALQRLEINYKTEVEH I REGDADLDH 
GV I RQVKVYVAS K RKLQVG DKMAGRHGNKG WSK IVPEADMPYLSNGETVQMI LNPLGVP 
SRMNLGOVLETHLGYAAKTAG I YVKTPVFEGFPEQR IWDMM I EQGLPEDGKSFLYDGKTG 
ERFDNKWTGY IYMLKLSHLIADKr HARS IGPYSLVTQQPLGGKAQMGGORFGEMEVWAL 
EAYGVAHMLOE I LTVKSDDVSGRTR I YES IVKGENLLRSGTPESFNVLIKEMQGLGLDVR 
PMWDA 

CPn_0082 97992 102221 

rpoC-RNA Polymerase Beta'

CSSYGRRRLKNDVLEKIMFGENSRDTGVLSKEGLFDKLEIGI ASDITIRDKWSCGEIKKP 
ETrNYRTFKPEKGGLFCE^rrcPTKDWECCCCKYKKIKHKGIVCDRCGVEVTLSKVRRER 
MAH I ELAVP IVH I WFFKTT PSR IGNVLGMTA3DLERVIYYEEYW I DPGKTDLTKKQLLN 
DAQYREWEI^KDAFVAKMGGEAIYDLLKSEDLQSLLKDLKERLRKTKSOQARMKLAKR 
LKI IEGFVSSSNHPEWMVLKNTPWPPDLRPLVPLDGGRFATSDLNDLYRRVINRNNRLK 
AILRLKTPEVIVRNEKRMLOEAVDALFDNGRHGHPVMGAGNRPLKSLSEMLKGKNGRFRQ 
NLLGKPVDY^GRSVTIVGPELKFNOCCLPKEMALELFEPFirKRLKDOG^VYTrRSAKKM 
[QRGAPEVWDVLEEI I KGHPVr^LNPAPTLHPL/; TQAFEPVL t EOKA [ R IHPLVCAAFNAD 
FDG DQM AV> 1 V P LSV FAQ LEAKV LMMA PDN I FLP JO K PVA I PS KDM T t / 1 L YY LMADPTY F 
PFrHC^KTK r FKDE IEVLRALNNGCF [ nDVF^;r>PRDE1Y;RG Itf IHEK IKVR EDOQI I ETT 
PClRVI.rMR rVPKELGFQhTY.SMP-KP ['.ir.LlU^: i'KKVn[ 1 EATVRFr.DDLKDLGr[QATKA 
At.jNCU.KDVR [PDIK:;H TLKDAYLKVA fVKKO /DDG [ rTE(jERH:*KT 1:1 [VTTKVSEQLSD 
AL.YVC I SKQTRSKHNPLFLM I D'.'SW JNK iK^L.KQI XI ALRGLMAK PN( IAI [F'lp [TflNFRE 
GLTVLEYfj I.SCHCARKGLADTALKTAD:;( IYI .Tf-PLVDVAQDV I ETEKIX^GTLNH EE TOA I 
CQf EELLPLKDR I YGRTVAKDVfOF IDK: :ui ,l .AQSGDVI JJH VQAEA E DPAG [ FT I K E R3 

tltc'c;e j rgvcakc\gl^lani suli^m ;i :av^; ; i aaq:; ioepc ;toe*tmrtfh [/ i aatg 

•JTI'R E ETN^DG [ LVTMDLRWU iQF/;! INLVU \YY< JAI MVV( JDFGUTE.NK'i KK LI ,1'TKrj E E 
SLKVE-PVCr^VK [LVADGTPV- V,Of> l< ;i;vr- IAU\ I P f rGOKPGF EKYHnLVP.! ; [:rpEKW 
NKNTGLVEL I VKQHRGELH PQ I A lYDDADLSELVGTYAI P5GA I ISVEEGQRVDPGMLLA 
RLPRGA IKTKD ITGGLPRVAELVEAPKPEDAAD IAK r DGWDFKG IQKNKRILWCDEMT 
GM EEEHLIPLTKHLI VQRG DSV I KGOO LT DGL WPH E I LE I CGVRELQKYLVKEVQEVYR 
LQCVDINDKH E El IVRQMLQKVR ITDPGDTTLLFGEDVNKKEFYEENRRTEEDGGKPAOA 
VPVLLG rTKAraUTTESFIGAAGFQIDTTRVLTDAACCSKTDYLLGFKENVIMGHMIPGGTG 
FETHKRIKQYLEKEQEDLVFDFVSETECVC 

CPn_0083 102296 103312 

: -i.xiM.N.-AKVf m twr^i.ri"* :t*[v r :; r *\ . , "t; p'; i , e lkvai. r ?k f 

OELl^EA\AA^IRQNGDDI^LbTILDKIQVNFALEIIKNIPGRI3LEIDAJlLi;FNVEA« 
VQ RAVFLSQLFEAMGGDKKRLL VK I PGTWEG I RAVEFLEAKG IACNVTL I FNLVQAIAAA 
KAKATL ISPFVGRI YDWWIAAYGDEGYSIDADPGVASVSNIYAYYKKFG I PTQIMAASFR 
TKEQVLALAGCDLLTI S PKLLDELKKSQH PVKKELDPAEAXKLDVQPI ELTESFFRFLMN 
EDAMAT EKLAEG I R I FAGDTQ IL ETAI TEF I KQ I AAEGA 

CPn_0084 103356 103751 

predicted ferredoxin 

SE^KNKMDYKSOLVFSCPCCCKGNVCFSVFNLDVrLTCNVCSSTYTFDSVIRNEIRQFVA 
LCKJ^IHDANSILC^ATVSVSVEDNQMDIPFQIXFSRFPVVLNLSLDGKKIAIRFLFDALN 
TS I LHQESDLIS 

CPn_0085 104512 103766 

CT311 hypothetical protein 

FSMKFFILFILIVAQFPAFSAQPRTQVSASHSKQAKARRTSRIRSSAATNASVSRYKTRA 
AARKKIGKFEKKPSLS PVQWVRYSGKNYS IQT PSLWQC I DDKTQLPEKLDVLLIGKGKGN 
LTPT IN IAQEITSKSSKEYIEEILAYHKANEMTLESGIFTQrQSPSGEFTI IKTEKNSSW 
GRVFCLQATTVIDHTAYIFTSTATUDDYAELSFTFLKWSSFQIRGGKEATSGDAILEKA 
LEALQNENK 

CPn_0086 104898 105527 

atpE-ATP Synthase Subunit E 

NIMANUJAIX3KLKQICDALRIJDTLKPAEDEAAWXHNAKEQAKRI I QEAQEEARKI LETA 
EE RAH QKIKQG EVAL SQAGKRALEAL XQ A VENK I FR ES LVEWLEHVTTD P EVSTKL I OAL 
VQALEAQGVSGNLTAY IGKHVS PRAVNELLGKAVTTKLRKXSVWGSFVGGVQLKVEEKN 
WVLDLS S SALLE I FTRYLQKDFREM I FQG S 

CPn_0087 105540 106376 

CT309 hypothetical protein 

SH'SK I F S I FKVWMTQYYFLS S FLPTQLPESVPLFS I SDLDDLLYLNLS ENDLCNYGLLK 

RlF]PbFENFAFF^AGKPIPFSFGEVTQENVERM^^ 

Rifi|HFSDr^REFLSYHG^ 

DES^EVLMQKDSPNYELPEEFSDLQGVLDITCGL^ 
r^WILARCATYMFArRNSLASVEKGREI INHIEKAIKW 

CPn_0088 106352 108145 

hypothetical protein 
S YRKGNQMVTVS EQTAQGHV I EAYGNLL R VR F CGYVRQG EVAYVNVDNTWLKAEVT EVAD 
QE^^QVFEOT^ACRGALVTFSGHLLEAELGPGU^IFroi^NRLEVI^ 

1 2 DHNLWNYT P VASVGDTLRRG DL LGTVPEG RFT HK I MVPF S C FQ EVTLTWV I S E 
GfYflAHTWAKARDAQGKECAFTMVQRWP I KQ AF I EG EK IPAHK IMDVGLRI LDTQ I PVL 
KGGTFCTPGPFGAGKTVI£HHLSKYAAVDrA7IL^^ 

SipSRTCIICm'SSMPVAARESSIYI/^IAEYYRQl^LDILLI^STSRWAQALREISG 
RL EE I PGEEAFPAYLS SR I AAFYERGGAI TTKDGSEGSLTI CGAVS PAGGNFEEPVTQST 
LAWGAFCG LSKARADARRY PS I DPL I SWS KYLNQVGQ I LEEKVSGWGGAVKKAAQFLEK 
G^EIGKRMEWGEEGVSMEDMEIYIJCAELYDFCYLQQNAF^PVDCYCPFER 
RIFDAKFVFDS PDDARSFFLELQSKI KTLNGLKFLS EEYHESKEVIVRLLEKTMVQMA 

CPn_0089 108111 109466 

CT#S9 hypothetical protein 

LDCWKKQWYKWRKDMQT I YTKI TDI KGNLITVEAEGARLGELATITRSDGRSSYASVLRF 
DBKiPTLQVFGGTSGLSTGDHVTFUJRPMEVTFGSSLIiGRRI^GIGKPIDNEGECFGEPI 
E f ATPTFNPVCR IVPRSMVRTNI PMI DVFNCLVKSQKI P I FS SSGEHHNALLMR IAAQTD 
^OT^ * GGMGLT FVDY S F FVE ES KKLG F ADKC VMF I H KA VDA PVECVL V PDMALAC AE K F 
AVEEKKNVLVL LT DMT AF ADALKE I S ITMDQ I PANRGY PGSLYS DLALRYEKAVE I ADGG 
S^pITVTTMPSDDITHPVPDNTGYITEGQFYLRNNRIDPFGSLSRLKQLVIGKVTREDH 
GDIJ^ALIRLYADSRKATERMA>KFKLSNWDKKIJLAFSELFETRU1SLEVNIPLEEALDI 
GWKILAQSFTSEEVGIKAQLINKYWPKACLSK 

CPn_0090 109439 110080 

atpD-ATP Synthase Subunit D 

VLAKSM S VQ VKL TKNSFRLEKQK LARLQT YL PTLKL KKA LLQ AEVQ NA VKDAAEC D KDYV 
QAYERI YAFAELFSI PLCTDCVEKSFEIQS I DNDFENIAGVEVP rVREVTLFPASYSLLG 
TP IWLDTMLSASKELWKKVMAEVSKERLKILEEELRAVS IRVNLFEKKLI PETTKILKK 
I AVFLS DRS I TDVGQVKMAKKKI ELRKARGDECV 

CPn_0091 110074 112053 

atpI-ATP Synthase Subunit I 

VRLNIHKYLFIGRNKADFFSASRELGWEFISKKCFITTEQGHRFVECLK^/FDHLEAEYS 
LEALEFVKDESVSVEDIVSEVLTLWKEIKGLLETVKALRKEIVRVKPLGAFSSSEIAELS 
RKTGISLRFFYRTHKDNEDLEEDSPNVFYLSTAYNFDYYLVLGWDLPRDRYTEIEAPRS 
VNELQVDLANLQREIRNRSDRLCDLYAYRREVLRGLCNYDNEQRLHQAKECCEDLFDGKV 
FAVAGWVrVDRI KELQSLCNRYOIYMERVPVDPDET I PTYLENKGVGMMGEDLVQIYDTP 
AYSDKDPSTWVFFAFVLFFSMIVNDAGYGLLFLMSSLLFSWKFRRKMKFSKHLSRMLKMT 
A r LGLGC ICWGTTTTS FFGMS FSKTSVFREYSMTHVLALKKAEYYLQMR PKAYKELTNEY 
PSLKAIRDPKAFLLATEIGSAGIESRYWYDKFIDNILMELALFIGWHLSLGMLRYLRY 
RYSO TGWI LFMVS AYLYVP I YLGTVSL I H YLFHVPYELGGQ IG YYGMFGG I G LAW LAM I 
OR.^WRGVEEI ISV7QVFSDVLSYLR IYALGLAGAMMGATFNQMGARLPMLLG3 IV I LLGH 
SVH [ IL5TMGGVIHGLRLNFIEWYHYSFDGGGRPLRPLRK IVCSEDAEACG IHLDNNSIV 

CPn_0092 112121 112573 

atpK-ATP Synthase Subunit K 

R YLKGAf!KV:JM tDMljWGPALVLGLAM I GiJ A I GCt IMAGVA;JHAVM:";R EDEOHOKL IGMCA 
Mpy.ljO'.'' I YG [■' [ f.Mt.LMOAA I KNGTL3 PVGG [ A I G L w VG AA L L V 1 ' : ' VM OG K re V'^G [QAYA 

h:;:;:, ! y» ikcyaa n nvKr.rsLFAWFALLLL 

CPn_0093 ( UJ440 I L JO 15 

tyriUi hiypt »r tn.-r. l < . * L protein 

':KA:;Wf;AKFKr^LDLRyYMG:;V^RLCL:^ILFHOLLLFt.RYYY':KLVFf;LTVLIJ\ArSV 
r^7,f.^G:;KP:;L;;:;FTEWGPEYJAAA0L:nE0SGHDCVY(X'0WV r rW^LP r ;RMRKCLPVT 
r.YI W/YGNGKVEKLTYCVNOSAGYRVYCLKGLCYKCLCx' I I 'A KVALCSGNQHTVSPRH 



hl,wmevi.;ld3p 

CPn_0094 113 104 livni 

valS-Valyl tRNA Synthetase 

WRVFLSRDHKFCLRTWTTEDFPKAYNFgiTrEPELr/FWEKNGMFKAEASSDKPPYSVIM 
PPPNVTCTya.i^HALVNTLODVLVRYKRMrTGFF/vCW r PGTDHAC EATQAVVERHLQASEG 
KRRTDYSREDFLKH rWAWKEKS EKVYLSQLPQ LGT "fcwDRK R FTME PLANR AVKKAFKT 
LFENGY TYRG'ri'LVNWDPVLCTALADDEVrri'EEKD' 3WLYY I R YRMVG SQ ES TWATTRPE 

> < *,M' ■"> 'MTFi:;,rr ' \* . - " . 1 ' \. - !:^< "VPKrr- -': ~ 

Vt JVJ Y RSGA V IEPY L3 K<*W p;^ , ^ ~ ^ A ^ L l~\ L , >j U I K I : t K LVV HLN \ L JWVN HLRLW 
GISRCLWVsGHRIPVWYHraJDDERVLCYIXIEGEPEEVACDFDSWYQDPDVLDTWFSSGLWP 
LTC LGW PDENS P DLKK FY PTALLVTG K D I LFFWVTRMVLLC S SMSG EK P F S EVFLHGL I F 
GKSYKRYNDFGEWSYISGKEKLAYDMGEALPDGWAKWEKLSKSKGNVIDPLEMIATYGT 
DAVRLTLCSCANRGEQ I DLDYRLFEEYKHFANKVWNGARF I FGH I SDLQGKDLLAG IDED 
SLGLEDFY I LDGFNQL I HQLEEAYAT YAFDKVATLAYEFFRNDLCSTYI E 1 1 KPTLFGKQ 
GNEASQSTKRTLIAVLLINVXGVLHPVAPFITESLFLRIQOTLGALPEGDGDAFTGHALR 
MLRSRACMEAPYPKAFDVK I PQDLRES FTLAQRLVYT I RN IRGEMQLDPRLHLKAFWCS 
JTTTEIQSCIPIt^AI^I.ESIOLLDKEPEKGLYSFGVa'DTIRLG I FVPEEHLLKEKGRLE 
KERVRLERAVENLERLLGDES FCQKANPNLWAKQ EALKNNR I ELQG I L DKLASFA 

CPn_0095 115956 118790 

pknD-S/T Protein Kinase 

AC I VCLDREDQRSLERYDI VRI IGKGGMGEVYLAYDPVC SRKVALKKI REDLAENPLLKR 
RFIJ^EARIAADLIHPGWPVYTIYSEKDPVYYTMPYIEGYTLKTLLKSVWOKESLSKELA 
EKTSVGAFLS IFHKICCTIEYVHSRGILHRDLKPDNI LLGLFSEAVILDWGAAVACGEEE 
DLLDIDVSKEEVLSSRMTI PGRIVGTPDYMAPERLLGHPASKSTDI YALGWLYQMLTLS 
FPYRRKKGKKIVLDGQRI PS PQEVAPYREI PPFLSAWMRMLAVDPQERYSS VTELKED I 
ESHLKGSPKWTLTTALPPKKSSSWKI^EPIIXSKYFPMLEVSPASVrr'SI^ISNIESFSEM 
RLEYTLSKKGI^EGFGIIXPTSEKALGGDFYQGYGFWLHIKERTLSVSLVKNSLEIQRCS 
QDLESDKETFLIALEQHNHSLSLFVIX?TTWLIHM^LPSRSGRVAIIV^DMEDILEDIGI 
F E S SG S LRV S CLA VPD AFLAEKLYDRALVL YRR I AES F PGRK EGYEARFRAG I TVL EKAS 
TDNNEQEFAIAIEEFSKUiDGVAAPLEYI^KALVYQRI^ 

I FRLKDHWYRLH ES FYKRDRLALVFMI L VLE I APQAIT PGQ EEK I LVWLKDKSRATLFC 

I^DPTVLE^SSKMFXFLSYWSGFIPHr^SLFHRAWrXJSDVPJULrEIFW 

SSC IDI FKESLEDQKATEEIVEFSFEDLGAFLFAIQS IFNKEDAEK I FVSNDQLSP ILLV 

YIFDLFANRAIiLESC^EAIFQAI^LIRSKVPENFYHDYLRNHEIRAHLWCRNEKALSTIF 

ENYT EKQLKDEQ H EL FVLYGCYLAL I QG AEAAKQ H F DVC RED R I FPASLLARNYNRLGLP 

KDALSYQERRT.T.T.RQKFLYFHCLGNHDE^LCQTMYHLLTEEFQL 

CPn_0096 124347 118837 

CT296 hypothetical protein 

ET FLS ILRE FFMKSLPVYVSG I KVRNLKNVS I HFNS EE I VLLTGVSGSGK5 S I AFDTLY A 

AGRKRYI STLPTFFATT ITTLPNPKVEEIHGLS PT I AI KQMHFSHYSHATVGSTTELFSH 

LAIiFTLEGQARDPKTKEVLDLYS K EKVLST IMELS EGVQ I S I LAPLLRKD I AA I HEYAQ 

QGFTKVRCNGT I HP I YS FLTSG I PEDCSVD IV I DTL I KS ENN I ARLKVSLFTALEFGEGH 

CSVLSDEELffTFSTKQQIDEfVTYTPLTQQLFSPHALiESRCSLCQGSGIFISIDNPLLIDE 

NL S I KENCC SFAGNCS SYLYHT IYOALADALNFNL ETPWKDLS PE I QN I FLRGKKNLVLP 

VR L F DQTLG KKNLTYKVWRGVLND I G DKVRYTT K P S RYL S KGMSAH SCSUZ KGTGLGDY A 

SVATWEGKTFTEFQQMSLNNWHVFFSKVKSPSLSIQEILC<3LK0RLSFLIDLGI^ 

RALATLSGGEQERTAIAKHIXXJELFGITYILDEPSIGLHPODTEKLIGVIKKLRI>^ 

I LVEHEERM ISLADRI ID IG PGAG I FGGEVLF^KP EDFLMNSSSLTAKYLRQELT I P I P 

ESREAPTSWIiLiTEATIHNLKNLSIRLPLARLI^ 

QENPKNLHFEW3CIGRLIH ITRDLPGRSQRSI PLTY IKAFDD IRELFASOPRSLRQGLTK 

AH FSFNQPQGAC rCCQGLGTm' I SDDUT P I PCSECQGKRYH S EVLE I LY EGKNI ADILDM 

TAYEAEKFF I SH PKI HEK I HALC SLRLDYLPLGRPLSTL SGGE IQRLKLAHELLFASPKQ 

TLYVLDEPTTGLHTHDIQALIEVT^LSLTYLGHTVLVIEHNMH^AT^ 

GGYLLASCT PKDLIQUTTPTAKALAPY I EGSLDI PWKS EPPSS PKSCDILI KDAYQNNL 

KH I DLALPRNS L IAI AG PGASG KH S L VFD I LYASGN I AYAELF PPY I RQGLLKETP LPS V 

GETVKGLSPVISVRKCSSShmSYHTIASAI^LS^LEKLFAILGEPFSPLTEEKLSKTTPQ 

TI IDSLLKSYKDDYVTITS PIPLGSDLEI FLQEKQKEGF IKLYSEGNLYDLDERLPLNLI 

EPAIVIQHTKVSPKNSSSLLSAISVAFSLSSEIWIYISOKKORKLSYSLGWKDKKGRLYP 

EITHQLLSSDHPEGRCLTCGGRGEILKISLEEHKEKIAHYTPLEFFSLFFPKSYMKPVQK 

LLKDENASQPLKLLTTKEFLNFCRGS SEF PGMNALLMEQLDTESDS PLI KPLLALTSCPA 

C KG SGLNDYANYVR INNT S LLD I YQEDATFLE SFLNT I GTDDTRS I IQDLMNRLTF ISKV 

GLSYITLGQROiyrLSLXSENYRLHLAKKISINLTNIVYLFEEPLSGLHPQDLPTIVQLLKE 

LVANNNTVI ATDRSCSLI PHADHAI FLGPGSG PCGGFLMDSDTEVC PSVDLHANVPQTEV 

CPKAPLS I SKANHTRGSDRTLKVNLS IHH IQNLKVSAPLHALVAIGGVSGSGKTSLLLEG 

FKKQAELL I AKGTTTF S DL WI DSH P I AS SQR SD I STYF D I A PS LRAFYAS LTQAKALN I 

SSTMFSTNTKCGQCSDCQGLGYQWIDRAFYALEKRPCPTCSGFRIQPLAQEVLYEGKHFG 

ELLHTPIETVALRFPFIKKIQKPLKALLDIGLGYLPIGQKLSSLSVSEKTALKTAYFLYQ 

TPETPTLFLIDELFSSLDPIKKQHLPEKLRSLINSGHSVIYIDHDVKLLKSADYLIEIGP 

GSGKQGGKLLFSGSPKDIYASKDSLLKKYICNEELDS 

CPn_0097 124549 126006 

pyk-Pyruvate Kinase 

DS M I TRTK 1 1 CT I G PATNS P EMLAK LLDAGMNVAR LN FS HGS H ETHGQ A IGFLKELREQK 
RVPLAIMIjDTKGPEIRLGNIPQPISVSQGQKLRLVSSDIDGSAEGCVSLYPKGIFPFVPE 
G A DVL I DDG Y I H A WVS S EADS L EL EFMN SG LLKSHKSLSI RGVDVAL P FMT EK D I ADLK 
FGVEQNMDWAASFVRYGEDIETMRKCLADLGNPKMPI IAKIENRLGVENFSKIAKLADG 

ihiargdlgielswevpnlokmmakvsretghfcvtatomlesmirnvlptraevsdia 
na i ydgss avmlsget asg ah pvaavf i mr s v i let eknlsh ds flklddsns alqvs p y 
lsaiglagiqiaeradakalivytcgg:jspmflskyrpkfpiiavtpstsvyyrlalewg 
vypmltqesdravwrhoac iyg teogilgnydr ilvlsrgacmeetnnltlt [vndiltg 

SEFPET 

CPn_0098 1274^4 l. J hO-fi 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LVOKKFHQIKRT ILEAPLYYLV^G I L/.U "Rf ITf'PSFT .T r JLy IKGFGFl^FY T T.'JDYRKTAL 

tnlalafpektfderyk r -\rojloml r :tllc.u.a i colvgn tdkl rr r vtt.^rhpkgfs 

SKFVrr:rJEDLECTFKNLOr:KO(]L[LF f 'iHOANWf :f,PFL/ TTKNYKI T AFAKATKNORL'-'K 

k : rALPEVFKGK rvppKNG [ v\>G [ ea i yji )Ki :jt , r vi ir/^ALLM' ;ytyplfg:; PAFTTTS 

I'AI.LA/KTGFrviAVNV^ROAKGFrv: i-'.AKl VANK: ILf'MKi :VA f LMlJ(jMMf JFLEKG I A 
■;OlTOWMWIHKRWKPKE:'NVIKKKYP/ 'Uri.VCVIWVJMF.'IKI.FAIAErFSfrrTLHLAL 
(3NAUHLEELOPOFPEYr:LrOLHNrX>On.ALI^r /PA I f M.TNNLW II ,YK! IFRKT^GOAVY 
".KPFLEK.'jLDHPQAPtiKN:'!.!' [ r'{''YU\ .KfiKI PKNI 'KVK: :K'W 'Rf .'I'VF 

No robust homolog present in Genebank/EMBL as of 11/7/98 

YY'-A:;YYLKE:".RFAAKHAPI KNM[.rJ , < 7TTRTI'TAAT[.Ln*KVn'F:AP:m'VOIKM[ l 7rK 

f:TiAVRAK:;rAD'!VATFAi.D::i:L::ro f yyrv[. i aa::ki-ppko': i kmi kppe.tkf 
CPn_0100 L2'U8) 127 W. 

CTOli hypothetical protein 

RTQKKTFILLDLE^IKFLSQLFrRHWPRKVVSLGFAI I IWILVGQSVTITRTLTNVPVR 
LVDLHPDGTV^X;LQKSGFIJWKVSLTTTG^iKNT^ 

KHNLVSVDHE I N t RKD E HSVDAND I FVRLTQYVTED I LLT TTKP IGS P PKGYEYLDVWPK 
YLNQKVSGPKEY INALKEQGLELTFNLNK I SFEELERNR I AOCSHDEI IFP I PKEWKKIL 
rPFEJJTFMDLNDPQADFLRLLFLKRECtPLNL^PVFLFFPVTFICJTMNPLEYSLDPVPP 

ir^-rM ■iiAii'iM.nJ-: .:piir - - u.rrAi' .lflnm. -.e:- nmfw i v 

TKTKETTKLYKKEW 

CPn_0101 129986 129141 

ybbF family hypothetical protein 

PSTLCNF SQYTTQG PS KTMPFD I TYYTTPLLE 1 1 LI WVMLNYLLKF FW3TRAMDWFGLL 
AFLFLFVLADKLHLPI IRRIJ^HVVNIAAIWFIIFQPEIRLALSRIRFHGKKFFIDTQE 
QFVEQLAAS I YQLS ERG; IG AL WL EKKDS FDEYLS FSSVKINATFS EELLET I FEPSS PL 
HDGAVI LRGDILAYARWLPLAHDTTQL^RSMGTRHRAALGASQRSDALIITVSEENGSV 
SLSRDGLLTRGVK IDRFKAVLRS I LS PKEHKRKPLFSWIWKR 

CPn_0102 130099 131466 

cydA-Cytochrome Oxidase Subunit I 

FY IQFWKFMDAL ILSR IQFGLF ITFHYLFVPLSMGLSMMLVIMEGLYLVTKKQ IYKQOTW 
FWGIFALTFVLGVVTGIMQIFSFGSNWANFSE^nXSNIFGTLLGSEGVFAFFLESGFLGI 
LJL,FGPJ^KVSKKMHFFSTCMVALGAHMSAFWI ICANSWMQTPSGYEMVMHKGKLIPALTSF 
WGVVFSPTTIDRFIHAVLGTWjSGVFLVISVSAYYLWKKRH^^ 

VLQLWS ADVTARGVAKNQ P AKLAAFEG I FKTEEYTP IWAFGYVDMEKERVIGLPIPGALS 
FLVHRN I KT PVTGLDQ I PRDEWPNVQAVFQLYHLM I MLWGVMVALTLI SWSAYKGWRWAL 
KPFFLVILTFSVIiPErCNECGWCAAEMGRQPWWQGLLKTKDAVSPIVQANQIVQSLVI 
FS LVF I ALLTLF I TVLC KKI KHGP E EENDLT EFEVK 

CPn_0103 131465 132511 

cydB -Cytochrome Oxidase Subunit II 

NRG I FMELS LTS LiPI^WYVILGVAVFAYSFGDGFDLGLGA^ LLNS IG 

PVWDGNEVWLV 1 1 VGG LFAGF PACYATLLS I FYMP I WTLVLLY IFRGCS LEFRSKS ESVS 
WKIFWDIIFICSGTAISFFLGTIVGNLILGLPLSPITrSYASLSWILFFRPYAALCGAVVA 
SAFAIHGSCFALMKTSDSLNARIAC^FPYILSSFT^VFYVLFI/jASLISIPKP.FIJAFPTYP 
LLILLIALTSCCCVAAKTSVSKKRYGYAFIYSTLNLLSLILSAATLTFPNILLSTVDPQY 
SYT I YNSAVETKTLKSLLI IVLIGLPFI ITYTCYIYRVFRGKTNFPSIY 

CPn_0104 133884 132676 

CT017 hypothetical protein 

EK SS3RMLQI SMLLLALGTA INS PA I YAADSQSVS F PEQL PSS FTGEIKGNHVRMRLAPHT 
DGTI IREFSKGDLVAVIGESKDYYVI S AP PG I TGYVFRS FVLDNVVEGEQVNVRLE PST S 
AESVRLSRGTQIQPASQEPHGKWLEWLPSC^/FYVAiCNF^ 

AhribLINSAI^FAHIELEKSLNEIDLEAIYKKI^VQSEEFKDVPGIO^LIQKALEEIQDA 
YSSKSLESQNTS IASSQCSTPKVSSSEVTTSLLSRH IRKQTALKTAPLTQGRENLEYSLF 
Riti^ASMOXJGNDHSEALTQEIAFYRAEQKKKQVLAGVLEW IA 
FCfYOTS I NLEQWLG KR VTVECL P R PNNH F A F PAYYWG I KEAS 

CPn_0105 134883 134029 

CfOfifce hypothetical protein 

YVFpRKFSNQNPMLL I YCKKKE I HLQWPQTAK IRFT PKI AMKVK INDQL I C I P PF I SARW 
S(3|SF I ESQ EG ENKDQGTLRLHL IDGK 1 1 S I PNLDQ S 1 1 D I AFQEHLLYLETSQSGKEDS 
RdKDKLX^GVLMNVLC^ITKGNDIQ 
' DVLEKMADVIRVLSGNNATLiPRPEPHCN^ 

QqGpKLYIVTNPLNPSDQFSVYLGPPIGCTCGEPNCEHIKAVLYT 

CPn_0106 135073 136374 

pHaff-ATPase 

EK^TQMKKTMVIDTSVFIYDPEALFSFENTR 1 1 1 PFPVIEELEAFGKFRDESAKNASRA 
LSNTRLLLENAKTKVTDGVLLPSGSELR I EVAPLSNDDRRGKLLTLELLKI I AKREPMVF 
VTjKgLGRRVRAEALQ I ESRDYESKRFS FRSLYRGFRELQVSQ ED I ENFYKNGYLDL PLDV 
VSSPNEYFFMSAGENHFALGRYYVSEGKI I ALKAMDKSVWG I KPLNTEQRCALDLLLRDD 
VKlJ^LIGQAGSGKTILALjAAAMHKVFDKETYNKV^ 
HWM^PIYDNMEVLFSINQMGNSSEALQALMDAKKLEMEALT^ 

LTP^EI KT I I SRAGKGTKI VLTGDPTQ I DSLYFDENSNGLTYLVGKFHHLAL YGHMFMTR 
TERSELAAAAAT I L 

CPn_0107 137321 136392 

CT058 hypothetical protein_1 

KKSPPPVTPKEIPTQPKPPIPQRPEVSPTPTDHIVPGSIEASPILGKKPSPDSMVSPLSL 
FHKMLLENWTPVEEPFPWPPAEKNQK I FAWALNQS KLI FVSTSGNI AQPRLVTDSMSMM I 
VN AAN RTK S R DG AGT NO VL S AA VSVD SWG LSQ R P LN P ERQGT PLN EG EC RAGMWRN ADG S 
NHTGKQGKPHYLAQLLGPKAVDHHNKSQAAFDRCKNAYLNCFSLAQTLGVTFLQ I PLISS 
GIYAPPENRKKPNSEENKVRMRWIHAVKCALVAAMQEFGNEPGNTDRRMLI^/LTDLKTPA 
ITDPKKKSHL 

CPn_0108 137887 137303 

CT013 

KNLFHY KA I LMS I FNEEVF I I SHRHT PLGQTSTALRNT PLVNPLHRTNLQR I ASY I P I FS 
TFIGIKTLKGISSLQYSMVLMTGNFSSVCKTLPCPEIYEELPKVRKEAWLEIFGIKALYY 
LVLGV I K I I KL I VRYLCPCCR PPEPR E PQNPLT PT PLDMGQQ I DA I FST PT3 PTGFKDPF 
LDDLLQEDKKKAPHL 

CPn_0109 13864b 141783 

ileS-Isoleucyl-tRNA Synthetase 

RQ KMTA DEVG KM S FAK K E EQVLK FWKDNQ I F EK S LQN ROG KTLY S F YDG P P F ATGL PHYG 
HLLA.OT IKDWCRYATMDGYYVPRRFGWDCHGVPVEYEVEKSLSLTAPGAIEDFGIASFN 
EECRKtVFRWHEWEYYINRrGRWDFSSTWKTMDAGFMES\A^FQSLYNQGLVYEGTK 
W PF f .i T A 1 jGT P L:j N FEASO NY K EVDD P S L WRM P LQ N DS AS L LVWTTT PVJT L P S NMA I A V 
'Jf-rrLWVREODKK^GEQWtLSCxKVSRWF^NPEEFVILESFSGKDLVGRTYEPPFTFFQS 
K R EEG A FR V [ AA3 FV E ES EGTG WHMA P A FG EG D F L VC K ENHVP LVC PV DAi IGf> FT EE I P 
QYOOOY I K! IADKE I IKFLKKEGR 1 FYHCTVKKRYPFrWRTDTPL t YKAVNCWFVAVEKI K 
HKM LP AN: '> 1 : 1 H WV P EM 1 0 BO R FG K WL Ft ^ A R DW A I : J R NR Y WCT P t P I WKSAETj Z I LWGS I 
HELEELT' 71'Q [TDIHRItF [DDLN IVKDOKPFHR I PYVFDCWFDSGAMPYAQUHYPFENQK 
CTEEAFPACF r AEGLDOTRGWFYTLTV r.'A L l.TDR PAFRNA E VNG I I LAEDrjfiKMSKRLN 
NYr*:;L J KWLr/rYf;AUAl.RI.YLLHiJVWK \EDLRFSDKG I Ef WLKQI LLPLTtr/LSFFNTY 
AKL,Y'M r DPK:XjD [tlPAYTE [DOW[L:JNLY , 'Wr 1 KVRF/;M^OYHLNFA\/EPF/TFrDDLTN 
WY IM<< 'RI^RI-'WEAF.OTrni-lRAAF^TLYniVLTVrCKV I ArrVPFLiAEDEYQKLKLEKEPES 
VIU.Cr J r-'lvVr-:MIjKI[.pnr ( CKRMHD[Rf:[VCU^l.U J RKCHKr.KVROP[ANF , r/'/GSKDRL^ 



LLKT FEGL I AEELNVXNV I ~/ E EA P ' * K I v TTV K PN F R MLG K K VG : J KKK EVCK ALS E L PNN 
AIDKLIQEETWVX.TIDDPEIALDGDDW I>'-*PHTDF<jY I ARo.SALFSVtLDCQLREPLIVE 
GI ARELVNKINTMRRNQQLrT/IDR IALR I KTTEAVHRAFLDYENY I CEETLI IAYDFTQD 
SDFQGENWD INGHATQ t E IT/33 1 DS 

CPn_0110 143755 141827 

lepB-Signal Peptidase I 

LSYPSIFMKOHYSLNKSPHILRSTYKLLKSKKLAHSPADKKCLQELLEQLEEAIFEHDQE 

*\ *rr v^\: \r '* f 7v^ — F agw \p\ pv™ ^/ptg ^mp y z 

•■ ^'i',': ^rr r '-*( ' *n "'• ^;:a^. .PL: i i daltK'*!^ ilipik 

KRYLKRCMGRPGLrLit : . l x ' .L-l. \of ^ I £rr:J v huLU*- ihVPY L JFUJ l"^ JHTL 
GQKTI IDFKQFNQSYGRLI rPQTSMYGOFFDHKEWHQDEPNKLKDPHLSPVSYADLFGMG 
WAl^ILTEHQARTSHLLPNPGSPTK^^'LEICHTANLSYPKPT^HYEHQLSPAIOPMK 
TLLPLRKEHLHL IRNNLTT3P F IVAOGCAYKYHQFK INTSG I AKAYAILLPKVPDGCYEY 
SKGEAYQ IGFGE IRYKLK3CH PLTQLNDKQVI ELFNCGINFSSIYNPVNPLQAPLPNRYA 
FFNQGNLY IMDS PVF I KND PTLQKFVTSETEKQEGS S ETQPY I AFVDKG LPPEDFKEFVE 
FIHNFGIQVPKGHVLVLCDrft'PMSADSRErcFVPMENLL^^ 
PTTLSGYLVSG IALATGLSLIGYVYYQKRRRLFPKKEEKNHKK 

CPn_0111 144761 143934 

CT021 hypothetical protein 

QLQmYPIMPNDSSTYFERII^KYIJIKKCCKTLFLFLFLSFLFSTAFSGLFASCTSSLRT 
IQENI FLAKTGDYTVLSRGSQRTFVLVKSTTPKTVWIEI I HFPC IAHKERPSLEQASWKT 
VIHQLESPSQVFWSLSSEGSQFFSLNTRTKSLEPVGKSTTVPAFLQIFDLPLSPAPANV 
IKTKGKENKPWSPKVSFEGAPLTSISVNAWCGLWPKDRGPLSETTGIOfYFTQPDISVFPL 
WVS I ETPKGTS IVRAVD IGKGATS PYVYSLPDSKTQ 

CPn_0112 144743 145093 

gatB-(Pet112) Glu tRNA Gln Amidotransferase (B Subunit) 
DSDFGVVNMKKNTHPEYRQVLFVDSSTGYKFVCGSTYQSEKTEVFEGKEYFVCYVSVSSS 
SH PF FTGSKKFVDAEGRVDKFLKRYSNVROPAQQPQ PEEDAL PAAKG KKKWTKKKK 

CPn_0113 145329 146405 

prfA-Peptide Chain Releasing Factor (RF-1) 

GFmKKVAEYLNRLAEVEIKI SNPEI FSNSKEYSALSKEHSYLLELKNAYDKILNLEKVL 
ADDKQALAI EKDPEMWMLEEG INENKYELEKLNK I LESLLVP PDPDDDLNV IMELRAGT 
GGEEAALFVGDCVRWHLYASSKGWKYEVt^ASESDL^GY^ 

GTHTiVQRVPETET^^RVHTSAITIAVLPEPSEEITrEIXINEKDLKIDTFRASGAGGQHW 
VTDSAVR ITHLPTGVVVTCQDERSQHKNKDKAMR I LKAR IRDAEMQKRHNEASAMRSAQV 
GSGDRS ER I RTYNFSQNRVTDHR IGLTLYNLDKVMEGDLDP I TTAMVSHAYHQLLEHGN 

CPn_0114 146371 147261 

hemK-A/G specific methylase 

VMPTTSYSNMEIKKAIQEGTAYIJ^Y^GVPLSIXrEALYILMDLLEVSSRAKL 

MLMEYR KRiiALRGQ RC PTAYLNGAVS FLGLRLRVDS RVL I PRT ET ELLAEY I INYLLSHS 

E I QTFYD ICCGSGCLGLAI KKSC PHVEWLSDVC PQAVAVANENAKShJGLDVK I LLGDLS 

APYTRPADAFVCNPPYLSF^IIHIDPEVRCYEPWKALVGGSTGLEFYQRIAOELPKIVT 

STGVGWLEIGSSCCES I KN IFSKHGIYGRLHQDLSGRDRI FFLEMDGRDPVSSGAYS 

CPn_0115 147279 148622 

ffh-Signal Recognition Particle GTPase 

MINSLSQKLSSI FSFLVSSRRINEENI SES IRE^RLALLDADVNYHVVKDFISKVKEKIL 
GEEIWKHVSPGQQFIRCLHEELVAFLSDGREEFTIQKTPSIIIXCGLQGAGKT^ 
DYV I KNKKAKKVLWPC DLKRF AA VEQ LK I L VAQT KAEFYQS Q ENK P I NVWKALA Y AKE 
NG HDFV I LDT AG RLNI DNE LME ELTA I QKVSQ ANE RL FVMNVAMGQ DVLATVQAF DQS LD 
LTGVILSMTDGDARAGAVFS IKHVLGKP IKFEGCGERIQDLRSFDPQSMAERILGMGCrr I 
NFVK EM REY I S E EEDAELG KKL VTAAFTY EDYYKQMKAFRRMG P LR KLLGMM PGFNNAK P 
SQKEIEDSECXJMKRTEAIILShfTPEEPJCELVELDMSRMKRIASGCGLTlXSDVNQFRKCMS 
QSKKFFKGMSKGKMEQVRKKMSGGNOWR 

CPn_0116 148592 148972 

rs16-S16 Ribosomal Protein 

E KNVRRKSVALK I RLRQOG RRNHWY RLVLAD VES PRDGKYIELLGWYDPHSSINYQLKS 
E R I FYWLE RG AQ LS S KAEALVK QG APGVY S AL L S KQ EAR KL WRKK RRA YRQRRSTQRE E 
AAKDATK 

CPn_0117 148983 150071 

trmD-tRNA (guanine N-1)-Methyltransferase 

TGMKIDILSLFPGYFDGPLOTSILGRAI KQRLLD VQ LTNLRDFG LG KWKQVDDT PF SGGG 
MLI>tAEPVTSAIRSVRKENSKVIYLSPOGALLTAEKSRELAAASHLILLCGHYEGIDERA 
I ESEVDEE I S IGDYVLTNGG IAALVL I DAVSRF I PGVLGNQESAERDSLENGLLEGPQYT 
RPREFEGKEVPEVLU5GDHKAISQWRLEQSERRTYERRPDLYLNYLYKRSIDHKFDEETT 
TNRDHFKCDK ISWLEVNKLKRAKNFYCKVFGLDAMSCENKFCLPHEGKT I FWLREVQAE 
KKN rVTLSLSLDCACEEDFCYLLRRWELFGGKLLEKQADEHAVWALAQDLDGHAWI FSWH 
RMK 

CPn_0118 150075 150464 

rl19-L19 Ribosomal Protein 

KKENFRWY I MVNLLKELECEOCRNDL PEFHVGDT I RLATK I S EGGK ERVQVFQGTVMARR 
GGGSG ETVS LHRVAYG EGM EKS F LLNS PR IVS lEIVKRGKVARARLYYLRGKTGKAAKVX 
EFVGPRSSKK 

CPn_0119 150520 151164 

rnhB-Ribonuclease HII 

LMNTS I SE I QRFLSM I AFEKELVSEDFSWAG I DEAGRG PLAGPWAS AC I L PKGKVFPG 
VNDSKKLSPKORAQVRDALMODPEVCFG rGVISVERIDQVNI LEATKEAMLQAISSLPIS 
PDILLVDGLYLPHDIPCKKIIOGDAKSASIAAASILAKEHRDDLMLOLHRLYPEYGFDRH 
KGYGT^ LHVEAI RRYG P^ FCHRKSFSPI KQMCAI V 

CPn_0120 151125 151778 

gmk-GMP Kinase 

Er; L F r ; N K/V>JVC YC MNK [L\ ?: I PFC> PDf KJKCC PKLFT I APAGVGKTTLVRMLEQEFSSAF 
A ET I '; VTT RKPREGEVEXIK *^Y H FV: ; H E F FQ R I - f.DRQALLEWV FLF( EC YGTSM L EIERTW 
f ;:/;KI[AVAVlD[0(*ALFtR:'RMP:";V; [FfAPnOFELERRLASRG.'TEEG.^ORKERLEHSL 

iKi^AANOKrjYvr tndduv ^YRVLK.'; i f iakeiirhtl 

CPn_0121 IMVf.O IS.MJi.H 

t TO U hypor her l< a 1 i-toff in 

rill rMlKKDPF-PNCKLNKLi*'^ 'PFSLVrtYA f K<_>AI< I Y EAKf KNVWXVNh 1 LTLVLLDREG I 

Oi'KriTiii'yvTAi-iTVKKKK *i :htn: ;pkkdi " :aytv/:jdvk 
CPn_0122 i r > is>7.:j 

metG-Methionyl-tRNA Synthetase 

GKVMPQKVL [TSALPYANCPLHFGH I AGVYLPADVYARFRRLLGDDVLY ICGSDEFGIAI 
TU4ADRBGU^YQETVI»IYHKLHKOTFEKLGFALDFFGRTTNPFHA£LVQDFYSQtJCASGL 
rENRrSEQLYSEQEORFLADRYVECTCPRCGFDHARGDECQSCGADYEArDLIGPKSKIS 
GVELVKKETEH3YFLLDRMKDALLSF IQGCTLPDHVRKFWDYI EHVR5RAITRDLSWGI 
PV PDFPGKVFYVWFDA P TGY ISGTMEWAA3QGNPDEWKRFWLEDGVEYVQF IGKDNLPFH 
:TWFPAMELG0KLDYK^DALWr;EFYLLEGRQF3KSEGrm^)MDKFL33YSLDKLRYVL 

Ai ?:r.:;-.. ;■:■"! *.;>■"• \ :r: v\ckni! v :>" " ' ~WLCDr:DP 

.* \r' r ■'■•!."."< -"AT VI'1 .\~ YH l JC I - , APW«'" : /IVEPVEAI 

LFCACYCQKLLALI3YPIIPE3AVAIWEMIJPKSLENCNLDTMYARDL.WKEEILDVINEE 
FHL fCS PR LL FTTV E 

CPn_0123 155975 153774 

recD-Exodeoxyribonuclease V (Alpha Subunit) 

NSMEK I CGYL EQ I LVENKDSG D ITAY I K I PNKTTP I L I KGKLPQPLELGS P I Q IYGVWS H 
SPSNTKYFQIHSYDSPLLYEYRGVFHYLTSKLIKGIGPKIAEKIIEKFQEKTCYVLDITP 
ERLSEVSGISETRCVSICKQLCEQKILRKTLLFLQEYNIPIHYGVRIFKKYQEKSIEKIC 
EDP FLLAREMEG IGFKTADF I AMKLGVPRNS ESRLC AG I QKS LEELQEEGHTCYP I ELL I 
DWAKLL^QDVF[XrPITLEEII?TQII*NMQKRKIJ^ 

LKRILFSSRRIRSIDGEKAIAWVEENLSIDLAEQQREAIKACFSEKLLIITGGPGTGKST 

I TQA I LK I FEQVTHK 1 1 LAAPTGKAAKRMTE I TQKHSVT I HALLQ.YDFKTKS FRKNHDNP 

IDCDLIIVXJESGMMDTHLLHHFLjCALPDYTTLVFIGDIHQLPSVGPGNILKDLITSNKMT 

VI RLNK I FRQWDSGIVTNAHRVNEGELPILYSETGRflDFLFFQKDDQEEALNHI IHLVT 

KFVPQKYHIYPQDIOVIJVPMKKGTLGIYNlJ^IKALKHALKPKKANIJiGRFQSYA 

I RNNYN KEV FNG D IG YVST I NF ED KA VWRME GKHVGY S F S E LDDL VLA YAT SVHKYQG S 

ESPCIIIPIHTSHFMMLYRNIJjYTAITRGKKLVILVGTKKAIAIATRNNRVQHRCTGLAE 

VL KEL DTKKNYAD L 

CPn_0124 156575 158063 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IRSKQRWAITLLVLGILLIASGIIFLAVAIPGLSSAVALGLGCGMTALG^ 
L I R S EKLALEQVE I KQ ARTR VNNELDQL S Q YVFYT ENVLDNLKRWSYRDLG FVRQAQ E EV 
TNLEQD I EE I FLTLRDIRNALDNEEFFMTHAKCCLAQVGESLFQDAS IDEFINLAHLSEI 
RQHLDINDPRWSMITKKVKGTWRFIWSTMYKQIKS^EKSDFGQI^^ 
VL YO S FQ KG YNRAALL S EKTR 1 1 HT S S LLHWEKDEDKHLN I KNECASRLENF KKFRTLF L 
GL S E EDV I D FTGAS GWBCS KL P RKEVP LDGG KKKLRFKRTF AD EQ VGDWDRTT SLEHMT P 
* QEEDPLDRLMDQVEQEATSVLKDQDRYWKEIETSEAKFRSLPREDDFEKQSQIDSYIRDL 
DDHLSWANQLSAAEDALIEVTDVQEHGNREMUCNIGXX3LELIEDAVKATLPRVDFIQEL 
LEKEELPLVAARMSLENS 

CPn_0125 158072 158605 

No robust homolog present in Genebank/EMBL as of 11/7/98 

KI SltAEIMSEVKPLFLKNDSFDLATQRFQKL INMLQEQAEIYNEYEEKNARVQNE IKEQ 

KDFplirKRCIEDFEARGLGVLKEELASLTRDFHDKAKAETSMLITC 

RQE^tl^KMAERYRIXKQVLEAVQVEQK 

CPn_0126 158806 161085 

No robust homolog present in Genebank/EMBL as of 11/7/98 
EiyF^YYCMGLFFFSGAISSCGLLVSLGVGLGLSVI^ 

AP1DLLDLEDAS ERLRVKAS RSLASL PKE I SQL ESY I RSAANDLNT I KTWPHKDQRLVETV 
S RK&ERLAAAQNYMI S ELCE I S E I LEEEEHHL I LAQES LEWIGKSL FSTFLDMESFLNLS 
HLSE^FYLAVNDPRLLEITEESWEWSHFINVTSAFKKAQILFKNNEHSPJ4K^ 
ELtltF IYKSLKRSYRELGCLSEKMRI IHDNPLFPWVQDQQKYAHAKNEFGEIARCLEEF 
EKT FFWLDEECAI SYMDCWDFLNES IQNKKSRVDRDY I STKKIALKDRARTYAKVLLEEN 
PT^EGKIDLQDAQRAFERQSQEFYTLEHTETKVRLEAI^QCFSDLREATNVRQVTiFTNSE 
NAWDLKESFEKIDKERVRYQKEQRLYWETIDRNEQELREEIGESLRLQNRRKGYRAGYDA 
GRpgLLRQWKKNLRDVEAHLEDAT^iDFEHEVS 

I E ELLS YEERC I L P I RENLERAYLQYNKC SE I LS KAKFF FPEDEQLLVSEANLREVGAQL 
KQYQGKCQERAQKFAIFEKHIQEQKSLIKEQVRSFDLAGVGFLKSELLSIACNLYIKAW 
KES IPVDVPCMQLYYSYYEDNEAWRNRLLNMTERYQNFKRSLNS IQFNGDVLLRDPVYQ 
PEtSHETRLKERELQETTLSCKKLKVAQDRLSELESRLSRR 

CPn_0127 162152 161130 

yt £S*^Cat ionic Amino Acid Transporter 

ESFHF'PSANQESRTRNVPLG I FHGLVACLYWG IVFVI PNFLGSFGDLD I VLTRYT I FG I F 
SLIAtSAIKNPSVIKKT PLYIWRKSLLWTLLINPVYYFG I TLG I RYVGSAITWI ASLAPT 
AVLYHSNTKQKELPYSLLFAI SSVI I TGVI LTHLSALNLPTAASPLYS I LGVIAVILSTS 
LWIYVIRNQSLLEKHPNLTPDTWSYLIGISALIICLPMIIILDLCGITHVTHNLISHTP 
GSERLLFLLLCSAMGIFSSAKALIAWNKASLNLSPALLGAILIFEPrFGLVLTYLYSQSL 
PSLQEG IGI FLMLGGSLLCLVLFGRKVQKSLENSQVSSSNE 

CPn_0128 162262 163053 

bpll-Biotin Protein Ligase 

EDRGRMLRNQVLVYCSEGVSPYYLRHTrRFLKYYSTQEGAFDILRVDGNFLIKNPFWEET 
TRLLVFPGGADRPYHRVLHCLGTARIFQYVSEGGNFLGICAGAYFGSKMIYFYEPEGAPL 
QG ARDLGFFPGTAKGPAYRGNFSYVSPSGVRVSPQLFSDFGLGY AM FNGGCFFEGSEGYP 
GVN IE3RYDDLPGKPASIVSRI VS KG LAVL SG PH I E YLP HYC RMVK ENVQKT R EFLQR ER 
TTLDRYCQNLVQRLRQPAFSKADC 

CPn_0129 163747 163064 

similarity to CT036 

DEQY I LSH IHMDPR EFVTSEPLQKTYQKLOEKHVNNLGI ASQVGLTDLQNKTQYENNLIE 
TTTNErTYYFPWHNPDILRSEWDPrSNOLYLIFKKFFIHYHNLFSTALERKQILLIDSL 
NTG3SMPTARQMELLAFLCVFEQLDYNEDEYTIEPRDYFNRFVYKNS0TAPQIQSFGLLH 
GYEEMS Y ASNN I RNVLTHS IVLCSP I LYQL ITEFDTTK I HADDFDCLI 

CPn_0130 164251 163751 

No robust homolog present in Genebank/EMBL as of 11/7/98 
:;:3MVKC:;:;[rHENKKPAOLLPESKFAACTKL3LArL':LFLG[AACILIALSGLLrNTLLI 

rALr'Lir; 1 i vl::tg i sll eotqc3kgvc>kdeqkpk i r fpkf-tp.ildpwllnplknk tqss 

ETLLLDPTfJ INLKNEI.I'FPIJFEEWKK EFLKDPDFLIK3ALANWK ILE 
CPn_0131 164441 lt*V,?in 

No robust homolog present in Genebank/EMBL as of 11/7/98 
I-'::::LKKKRF:JL::LA[[ MFFFT^AYVF^ rc FLfLFMf ;NAM: \' V If 'VYNi JP'IW ! LKT3VAQ 
KVKKKMGK^fOVr.UI'rSVMt.FIGrjJVrAl-rrPOYLIVFVT.TtALLML^l^LVf-FLLIRSV 
Ul :: :MVDKI W: IRK^YAl ,1 IijI IKNt IPKI JJVKPVjO I LLK, L'Y L KVHALWL 1 : K \\j [i'EDPUOAA 

vt.Luuwri-T^svnvi-Au.i";! oekegky ua-viw:. :h [iw:r.r,vn..;AfTLDDLNEO 

« 'VMW ,Mt INEKFLFF [ NKKARFI IGI^DLKHE rM'^U.F.rrGVE^LDP-iM'iroV^AMFlSVYRY 

i.Hf>ijr;r['::Ki.ucM ;, iir.[.::i , i , K(:nvvn( , LA : f en r k d i .ai x . d f t . f ai - k nv e w ; k f 1 3 ac ek 
ALLKNPQCrSrKDLKQFLVs 

CPn_0132 165584 166561 

No robust homolog present in Genebank/EMBL as of 11/7/98 

3M I EFAFVPHTSVTADR I EDRMACRMNKLSTLA ITS LCVLI S SVC IMIGILCI SGTVGTY 

AFWG I IFSVLALVACVFFLYFFYFSSEEFKCASSQEFRFLP I PAWSALRSYEYISQDA 

rNDVIKDTMQL3TLSSLLDPEAFFLEFPYFNSLlVNHSMKEADRLSREAFLILLGEITWK 

DCETKILPWLKDPNtTPDDFWKLLKDHFDLKDFKKRIATWIRKAYPEIRLPKKHCLDKSI 

, K r r ,. : ^r^/---^-- - 'in"/ryc^rppA 1 ^7 i otG3^T^n"GL?WP^r!r J TW'>fF 

MI *J^i * ' :.. ' ■•>■'- * V'!. 

CPn_0133 167349 166564 

CHLPS hypothetical protein 

NSSAYMFKLLKNLFL IGCC I VGYFWMRKES IVEQWLSNRLHTQVTVGRVS IRTSGI KIRH 
I C I HNP LAS ERFPYAAE I EY ADVRFS S I SMLLTKQLE I S EL 1 1 HGANFT I FPYDSHGTKT 
NWSLVWKNFHPQKETPSNLWIDRAPVLIRRCLFLNTRLYGLRANHKDIPHLSVPSLEFHS 
HTSSAKELPKLSEALPSLLYLALEESLYHLNLPGD I IKPLSCa^AHKHFYSSYPQFQDRLN 
DINTPGTPTEEIIGFIRGLFFH 

CPn_0134 169131 167467 

groEL-HSP-60 

FADYRJCUWTTMAAKNimEEARKKIHKGVKTLAEAVKVT 
TKDGVTVAKEIELEIIPCHENMGAQMVKEr/ASKTADKAGDGTTTAW 

GANPMDLKRGIDKAVKVWDELKKI SKPVQHHKEIAQVAT I SANNDSEIGNL IAEAMEKV 
GK^SITVEEAKGFETVLDVVEGMNF^GYI^SYFSTNPETOECVLEDALILIYDKKISG 
I KDFLPVLQQVAESGRPLL I IAEEI EGEALATLWNRLRAGFRVCAVKAPGFGDRRKAML 
EDIAILTGGGLVSEELGMKLE^^^^LAMLGKAKXVIVTKEDTT IVEGLGNKPDIQARCDNI 
KK Q I EDST S DYD KEKLQ ERLAKLSGGV AV I RVGAAT E I EMKEXKDR VDD AQ HAT I AA VEE 
G I LPGGGTALVRC I PTLEAFLPMLANEDEAIGTRI ILKALTAPLKQIASNAGKEGAI ICQ 
QVLAR S ANEGYD ALRD AYTDM I DAG I LDPTKVTR S AL ES AAS I AGL LLTT EAL I AD I P E E 
KSSSAPAMPSAGMDY 

CPn_0135 169448 169143 

groES-10KDa Chaperonin 

MS DQATTLR I KPLGDR I LVKREEEEATARGGI ILPDTAKKKQDRAEVLVLGTGKRTDDGT 
LLPFEVQVGDI ILMDKYAGQEIT IDDEEYVILQSSEIMAVLK 

CPn_0136 171419 169569 

pepF-Oligopeptidase 
KGVPSLMTTELKTEALPTRTQVDPKHCWDTT^ 

SPSHYQIDNPESLLELI^KKFSVERJ0J>5LYIYAHLIHDQDITOTEGESDYQSIVYLYTL 
FSQE I SWIQ PAL IALS EEKVAALLSS SVLAPYRFYLEKI FRLS PHTGTANEEKI LAS SFA 
ALNVSNKAFS S LS DAE I PFG I AKDSNGEEHPLSHALASLYMQS PDQ ELRRTAYLAQFQRY 
YDYRNTFANLLNGKVQAHLF EAKARNYPSCLEASLFQHNI PTTVYI NL I NETKKHTSLIN 
RYFT^LKKEAIJJLKEFHFYDVYAPISQTTSKNYSYTIEGVDLVCKSL^ 
LSNRWVDRYENKHKRSGAYSSGCYDSAPY I LLNYTNTLYDVS VI AH EAGH SMHSYFSREA 
QPYHDAQYPLFLAEIASTFNEMLLMEALSKSDQSKEDKIVI ITKTLDT I FATLFRQTFFA 
AFEYEIHSAAECGTPLTE^FI^ATYGNI^KEFYGGWrSDSLSALEWARIPHFYYNFYVY 
QYATG I IAALSFAEKILTQEPGALELYLKFLKSGRSDFPLNI LKKSGLDMTTSAPLDKAF 
AFITKKIDLLSSLLSED 

CPn_0137 172263 171502 

ybgl-ACR family 

VCSMNVADLLSHLETLLSS KIFQDYG PNGLQVGDPQT PVKK I AVAVTADLET IKQAVAAE 
ANVLIVHHG I FWKGMPYP ITGMIHKRIQLLIEHNIQLIAYHLPLDAHPTLGNNWRVALDL 
NWHDLKPFGSSLPYLGVQGSFSPIDIDSFIDLLSQYYQAPLKGSALGGPSRVSSAALISG 
GAYRELSSAATSQVIXFITGNFDEPAWSTALESNINFLAFGHTATEKVGPKSLAEHLKSE 

FPISTTFIDTANPF 

CPn_0138 174094 172700 

"hemL-Glutamate-1-semialdehyde-2,1-aminomutase" 

TNSRLFLAI KDQ LLQNMWKLTKRNSMLNC SNQ KHTVT FE EACQVF PGGVNS PVRACRSVG 

VTPPI VSSAQGD IFLDTHGREF IDFCGGWGALIHGHSHPKIVKAIQKTALKGTSYGLTS E 

EEILFATldLLSSLKIJCEHKIRFVSSGTEATMTAVRIjARGITNRSIirKFIGGYHGHADTL 

LGGISTTEETIDNLTSLIHTPSPHSLLISLPYNNSQILHHVMEALGPQVAGIIFEPICAN 

NKJIVLPKAEFLDDIIEI^KR^SLSIMDEWrGFRVAFQGAQDIFNLSPDITIYGKILGG 

GLPAAALVGHRS ILDHLMPEGT IFQAGTMSGNFLAMATGHAA IQLCQSEGFYDHLSQLEA 

LFYSP I EEE IRSQGFPVSLVHQGTMFSLFFTESAPTNFDEAKNSDVEKFQTFYSEVFDNG 

VYLSPSPLEANFISSAHTEENLTYAQNI I IDSLIKIFDSSAQRFF 

CPn_0139 174686 174093 

yqgE 

S PTKNKLRDIMK I PYARLEKGS LLVAS PD INQGVFARSVILLCEHS LNGS FGLI LNKTLG 
FEI SDDI FTFEKVSNHN IRFCMGGPLQANQMMLLHSCSE IPEQTLE ICPSVYLGGDLPFL 
QEIASSESGPEINLCFGYSGWQAGQLEKEFLSNDWFLAPGNKDYVFYSEPEDLWALVLKD 
LGG K YA3 L3TV PDNLLLN 

CPn_0140 175140 174673 

yqdE 

PRSNQQKIFCMSLEKELLEETPLVLLNFYKLVSFCNYAGMILGTEEKKFAIYGHVSMGQA 
FQGADTEGHSPQRPFAHDLLNFVFSGFDIQVLRWINDYKDNVFYTRLFLEQKDREFLYV 
VDVDAP PSDS I PLALTHK IP ILCVKSVFDAWPYEE 

CPn_0141 175817 175110 

rpiA-Ribose-5 -P Isomerase A 

HSSSAVEKDLHLHEKKCL.\HEAATQVTCGMILGLGSGSTAKEFIFALAHRIQTESLAVHA 
[A3SQM3YALAKQLA I PLLNPEKFSSLDLTVDGADEVDPQLRMI KGGGGA I FREK I LLRA 
AKRS l IL'/DESKLVPVLGKFRVPLEISPFORSAI IEEIRHLGYEGEWRLQDTCDLFITDS 
3NYIYD[F3PN3YPNPEKDLLKLIQIHG"/IEVGP/rEKVEVW3SNSQGLI3KKY5V 

<"Pn_i)t4^ I.V121 17581^ 

No robust homolog present in Genebank/EMBL as of 11/7/98 
rJH'^JKVFGTLEKFUFKILOLLCTKNfJrLNFG^HFEr^RV^HDNAtQKER^YPI^KPtAEN 
R [NTL.':FKDLKIDYPKO^^KRFPFLY r : [OP [ P LE f K LW F. Y FVT 

CPn_0143 W.' J47 l/f,? M 

**yx jG.Lj'^.L Hyporhot ^.al Proff-m 

R R ' l' { MI IT 'XK R P LK : ! H F DVVC3 F LR P E i I LK KT P E3 L K EG 'j 1 3 LDO LM(J LFDIAIODLIKK 
fjKA/V iL.^i'' rTD.XE'RRA1TVl [YDFflWf IFIKj'/GfiMPATEGVFFDf IERAM IDD'rYLTDK [3V3 
HHIM'VrjMFKFVKALEUFFTTAKOTLr'APAOf'LKOMI FPriN [EVTRKFYf'TtsIOEI. TEDCVA 
GYRKVri'JjLYDAGt.'RYt^LPDfTI'GCr.VLI'H'A r.WfG I DEKOLQDt, LQQYLL INNLV LAD 
RPDDLVYNLHVCRGNYHGKFFAIjGGYDFT AKPLFECTNA/DGYYLEFDHER3GDF3PLTFI 
3CEKTVCLCLVT3KTPTLENKDEV tAR [HQAADYLPLERLGL3P0COFA3CEIGNKLTEE 
BQWAKVAt.VKErSEEVWK 

CPn_0144 177942 180560 

clpB-Clp Protease ATPase 

KVLiGWFMEKF3DAV3EALEKAFELAK3SKifrY"/TENHLLI^LLENTE3LFYLVIKDIHG 
N Pf 1LLNT AVK DA LGR E PTWEG EVD P K PS PCLCTLLRDAKQEAKTLGDEY I SGDHLLLAF 

.■/ ' 'r'tr VW.'.'.'ir '''I "/ fJT, "l* . ' -V'! V ' I ; ."I ,PK :' "<NIj r\L\F ^ 

■.■":*;r'!;i':i-:i i'J'T [fjvr.. iv : rirn n'.: ;r: ■ '^rv ^ •':\.Ai.^' i :^'i-'\Trr r Wi< 

KQ LYVL DMGAL I AGAK Y RGE F E ERLK3VL K DV E3GDG EH 1 1 FIDEVHTLVGAGATDGAMD 

AANLLKPAI^GTLHCIGATTLNEYQKYIEKDAALERRFQPIFVTEPSLEDAVFILRGLR 

EKYEI FHGVRITEGALNAAVLLSYRY I PDRFLPDKAIDLIDEAASLIRMQIGSLPLPIDE 

K ER ELAAL I VKQ EA I KREQS PSYQ EEADAMQKS I DALREELASLRLGWDEEKKLI SGLKE 

KKNSLESMKFSEEEAERVADYNRVAELRYSL I PQLEEEI KQDEASI^QRDNRLLQEEVDE 

RL I AQWANWTG I PVQKML EGEAEKLL I LEES LEERWGQ PF AVSAVSD S I RAARVGLND 

PQRPIXVFT.FLGPTGVGKTELAKALAI3LLFNKEEAMVRFI*£ 

VGYEEGGSLSEAIJlRRPYSWLFDEIEKADKEVIjNIIiQVFDIXjILTDGKKRKW 

F IMTSNIGS PELADYCSKKGSELTKEAILSWSPVLKRYLSPEFMNR IDEI LPFVPLTKE 

D I VKIVG IQMRR I AQRLKARRINLSWDDSVI LFLSECGYDSAFGARPLKRLIQQKWILL 

SKALLKGD IKPDTS I ELTMAKEVLVFKKVETPS 

CPn_0145 180717 182369 

CT114 hypothetical protein 

NCAASFIWLM<SSHRNLRSPMFKSFIWYMFVGGLVSFLLPIPDLECANNVTKTYDKKAS 
VISRDLKLQEDCQKFWNLDPYKLESLCAYQVLYHDDYSSKRIRELFPQIQKDEVPIFATM 
I LTLGKVDRGFS PEEI SL IQKLSYPGLSLASLRGSTE IDPNTDLARALWSEFSGDLGKN 
RADYYSNCLDII^RIHAERQRYr^SPCVPGTSEFHKATIEAINTILFYEEAVRYPSKK 
EMFSDEFSFLSSVTDRKFGVCLGVSSLYFSLSQRLDLPLEAVTPPGH I YLRYQGGEVNI E 
TTAGGRHLPTASYCDCLDLEDLC3WTPEEMIGLTFMNQGSFAUJKKKYKEAEEAYKKAQE 
YLGDEELQELLGFVQ I LGGKKKEGKSLIGKSPRASQKGSVAYDYLKGRINI PTLALLFSY 
PG SNYEE I ASYE EELKKAMKS SM PCC EGQRRLASVAFHLGK*TAEAVALLEKCVED I PNDL 
SLHLRLCKI LCDRHEYTKALKY F 1 1 AERLMEDQGFLICKDNRSFALFYEVKKI I SKVAPQK 
ANTLLLMESER 

CPn_0146 182595 183095 

No robust homolog present in Genebank/EMBL as of 11/7/98 

1 1 VG I SMS S SEWFQTVHGLG FGGLS S K SWP FKKS LS DAP RWC S I LVLT LGLGALVCG 

lAITCWCVPGVIU^ICAIVLGAISLALSLF^WGLFSN^ 

GQ FS RAAPSGMGL PG DG S P RAST PSC LEELCAE I QAVTQA I DQMS DD 

CPn_0147 183213 183671 

No robust homolog present in Genebank/EMBL as of 11/7/98 
HGGPKAVQS I KEAVTSAAT SVGCVNC S REAI PAFNTEERAT S IARSVIAAI IAWAISLL 
GL^VVIAGCCPLGMAAGAITMLLGVALLAWAILI PSPGKNGEPNERN 
SATBPLEGGVAGEAGRGGGSPLTQLDLNSGAGS 

CPn_0148 183822 185702 

pkn1-S/T Protein Kinase 

Gtfo^VSSMESEKDIGAKFLGDYlULYRKGQSLWSEDIJ^^^ 

SQ^MEAFHDVVVKLAKLNHPGILS I ENVSESEGRC FLVTQEQDrPILSLTQYLKSIPRK 

LTS^EIVDIVSQIJ^LLDYVHSEGIAQEEWNLDS^ 

IDI^FISDEENRE'SKIKERVIXHTSEGKGGREDT^ 

FgMl YDWDFLI SSCLSCFMEERAKELFPL IRKKTLGEELQNWTNC I ESSLREVPDPLE 

SSQNLPQAVX,KVGETKVSHQOKESAEHLEFVLVEACSIDEAMDTAI£SESSSGVEEEGYS 

LXLQSLLVREPWSRYVEAEKEEPKPQP ILTEMVLI EGGEFSRGSVEGQRDELPVHKVI l 

HdfeELDVHPVTNEQFIRYLECCGSEQDKYYNELIRLRDSRIQRRSGRLVrEPGYAKHPVV 

GVTWYGASGYAJiWIGKRLPTEAEWEIAASGGVAAI^YPCGEEIEKSRAN^ 

Yp^PYGLYDMAGNVYEWCQDWYGYDFYE I SAQEPES PQG PAQGVYRVL RGGCWK S LKD D 

LR-pAHRHRNNPGAVNSTYGFRCAKNIN 

CPn_0149 185706 187700 

dnlJ-DNA Ligase 

ERF^EENSQAHYLALCRELEDHDYSYYVLHRPR I SDYEYDMKLRKLLEIERSHPEWKVL 
WSlPSTRLGDRPSGT FS WS HKE PMLS I AN S Y S KEELSEFFS RVEKS LGT S PRYTVELKI D 
G IAVA I RY EDRVLVQALSRGNGKCG ED IT SNI RT I RSLPLRLPEDAPEF I EVRGEVFFSY 
ST'F&I INEKQQQLEKT I FANPRNAAGGTLKLLS PQEVAKRKLEI S I YNLI APGDNDSHYE 
NLQRCLEWGFPVSGKPRLCSTPEEVI SVLKTI ETERASLPME IDGAVI KVDSLASQRVLG 
ATGKHY RWALAY KYAP EEAETLLED I LVQVGRTGVLTPVAKLT PVLLSGS LVSRASLYNE 
DE IHRKDI R IGDTVCVAKGGEVI PKWRVCREKRPEGSEVWNMPEFC PVCHSHWREEDR 
VS VRCVNPECVAGAIEKI RFFVGRGALN I DHLGVKV ITKLFELGLVHTCAOLFQLTTEDL 
MQ IPG I RERSARNILES IEQAKHVDLDRFLVALG I PLIGIGVATVLAGHFETLDRVISAT 
FEELLSLEGIGEKVAHAIAEYFSDSTHLNEIKKMQDLGVCISPYHKSGSTCFGKAFVITG 
TL EG MS RLDAETA I RNCGG KVG S SVS KCTDYVVMGNNPG S KL EKARKLGVS I LDQEAFTN 
LIHLE 

CPn_0150 187759 192444 

CT147 hypothetical protein 

I I YYKFFYS YNCPYFI SFFVLLGVNMASSSNNSTKQDG I P S WVN PNVQWNRA3Q VG DQ EA 
NSLTPEAQTSRSWFSDRKHF LEVLDVSLEEMENNDLKKYSRYKTI I LIATLVTVAITCIV 
PISMVFGIPMWVPCLILFGAGLSSAFLSHRLOSKCKEIHLRYRAYQIYRQQLLSQYPDLR 
K3TLYKYSTTHVKPKKGFVGKLVENLRPDLHKNKDDGGAAADSRLDFAGYGVKHYQTDAL 
LG VSG VNS V EWQR LAS L I MSVKND I LNDVGS R E P I D KAQ R S ALWSGKD I GG E IQ PGG I L 
DI SRDI LAICGYGMWGVEAKKAIDQYKKWYLNSSTFIAWNPQLPAIAQSYLLEQQRHLD 
YAAKIFQDL5ALTTAHCTGQALEDLDSLLCYYDQLI ESKGVGEKI IAS I HQKHLDLAMQD 
:X1jOEHLKKWSNLYHVFSITIKEFTEGKLEONEVVSRIQRLRGKLEKSKCSILGNCRTNA 
EYATK3EKKLADYLLQIGDREPFLTGMHKAIATGKAIQGKVEGVISQHPEKQIMMLRCSI 
EPLEGMLRREDWGA r loknedevlalkstmeaqlqofkdlvgtwegkyoefkknklskvl 

V-fDFTK^YSNLLNRLEVLHAEGSTDDLVLHVDRMSEDLKKTIEEIDCNLFQVTPEELSLL 
APKYO(.;i,MNELrLI VQEGNRL0EAIC3EGVSQGLMLLI I3LLNRDEK INKNIE33RKNLVA 
r AKQAR: :DARN I DMOGLAPLIQRNRACLDN I LONMY LFMG3 I RN IHALDTETLVAT3SNM 
K.^AWffTFDWNIYTNLLDVLEIQSKPAPAPMENrDLP^ALPEEVODAVAEDV^GTHRLHHQ 
VI YU Ri ' ADr.KNM [ : IQLOKS ENKWGMAKA I VLC I VAYLFCVL: u\ I F IGQN t L3LLILSCVG 
1 ,L.LTW .'['[. r FDR I 3KSKEFEKQVLCTAQSL I PATK ILPSEFNNKDLNRLAKLQDNLNLE 
« IK f U'TWAKN L VSDI .EC, I rTKEK.';LKDLTKCFRKD ~KI ILNKR I KRRFKEGUjQEAPWRPT 

[L'<jl>[Ri;al-:vfaklurcll:hlokokeci jircdxlyoepmclclek^kydnekahaaamt 
k kt, klon l i )h lok nn lt y vr i qn f r r' r l 1 0 e k lg p dtvq e i d w kcak elhelaa 1 1 yg 
n'ttxjk: vk^kakkofkcnvlh lagkgqlellcayljr/tagc* '^crh^mqasfrerillnp 
uiakhcl-'af.rti .a: irkkmlktlolsyltpfvr f^pfttq 1 -gym j i lkvreqlfdi eqrl 

(jNOKTV: :\ 'El >Y AAVOA ALAAYVR Ki i E.I L [ V3T YG L/JA r jD lOTlJSKVTTLMRDLHAVEELV 
KM f ;VI;M'YRi,NR: :tV f LHRVH. f ;VLi IStlLRDCDfl^GMf; [ IDWKKLFLLLNNNC^PNDPEC 
OKYMQILLOAPVSLLYGAFKGFKNEFLl^Ni^ELN EANSTKAAEEEAKRYVEEKGRGFETY 
WEEAKQRLEAIAAELDDLRNOETLLECE I PLANLK 1 3 1 F SD LNLR EKVSVEKAALE EE I Q 
GIQEQYAEMQGI EDLELKQKFEDLCKKLEALEERLLQ IGRR I OSSVDKQKELLGLLCREE 
AA 

CPn_0151 194178 l?2h2S 

mhpA-Monooxygenase 

CYE^FHYPRASMADILVIGANFTCLIL^MLrQHGIGVKVtCHRASPEDPSFLDCRKLP 

\ rrT . ~r ~: :.M*jTf': ~" r w n " " — ^"'^y^ r ?L:rv t: -"c^'" '^^^^lgttyc 1 
ii.F'jHL rr, ~" KP f ■ * ' w ■ "i T> >. : v -ON! -::ip -v-r.' u '\\ z - \ :2ArN 

NLDIRDLVKowLRAKK LNrttVi r L NC L ^ucr ELLh I HLLP ITKtit LNE- VF t NP^EKTKC 
LCLPQGTHS ISPKLKCKLLYTYNLVISDENFH IKTSHHAFPPEHGNVLFLGSLSNTLLLS 
YLNGINTN I HAAFNLAWKLLPVLKKAALKHLVITKEQEDGNILPY I SPTTEKRAKKLPFS 
RFYTPALMrrFLKGCRKFNTTGEEYYYPPHQALKYRSSD I IKMSPQDKEI HG PGPGHRAI 
DARLENGSFLIXJPLKSSK^IXIFFKDIPDLKEAUJEH^GEWIEICNVKEPRILNLYHANP 
NSLFI IRPDRYIGYRTHTFKLHELISYLLR I FAS EKTS 

CPn_0152 195274 194318 

CT149 hypothetical protein 

L I KMRKVAFLVSCLFSVA IGAS AAPVRVPGF PQ I P EDLVQI KTEVC PKQEVCLAVT IKCD 

DHNLIGVIiiLPOTPTPEGGFPTWLFHGFRGTKFGGLTGAYRKLGRKFAAVGIATLR\mM 

AGCGDS EGVAEEVP I ETYLRDAGT I L ETVQEH PDLNAYRLG I SGFSLGCH IAFELAKIYN 

PRDLNI?^SVWAPIADGGILLKELYENFSKHGEGDriSVGKDFGFGPPPIIVCSGDVDL 

LIRrQDHVTANSLPTKPYII^C^IDDTLVSRTOTLFKNTAPGROTFISYPm^HNIA 

APDLDMILDQIVSHFQRTL 

CPn_0153 195430 197892 

leuS-Leucyl tRNA Synthetase 

NMRYDPNL I EKKWQQ FWKEHRS FQ AN EDE DKVKYYVLDM F P Y PSG AG L HVGH L I GYTAT D 
IVARYKRARGFSVLH PMGWDS FGL PAEO Y A I RTGT H P KVTTQ I ANF KKQL S AMG FSY D 
EGREFATSDPDYYHOTQKLFLFLYDQGLAYMADMAVNYCPELGTVLSNEE 
YPVERi<MLRGWILKITAYADKIJ J E£IJDAL EGALVT FHLTQEG S 

LEAFTTRLE3TLLGVSFLVIAPEHPDIJ3SIVSEEQRDEVTAWQE 

TGVFTGNYAKHPITGNLLPVWI SDYWLGYGTGWMGVPAHDERDREFAEMFSLP I HEVI 
DDNGVCIHSNYTIDFCL^LSGQEAKDWIKYLEMRSLGRAi^^ 

IPIIHFEDGTHRPLEDDELPLLPPNIDDYRPEGFGCX3PLAiCAQI>A/HIYDEKTGRPGCRE 
TYTMPQWAG SCWYYLRFCDAHN S Q LPWS KEKESYWM P VD LY IGGAEHAVLHLLYSRFWHR 
VFYDAGLVSTPEPFKKLINCGLVLASSYRI PGKGYVS I EDVR EENGTWI STCGEIVEVRQ 
EKMS KS KLNGVD PQVL I EEYG ADALRMYAMFSG PLDKNKTO S NEGVGGC RRF LNRFYDL V 
TSSEVQDIEDRDGLVLAHKLVFRITEH I EKMSLNTI PSSFMEFLNDFSKLPVYSKRALSM 
AVRVLEPIAPHISEELWVILGNPPGIDQAAWPOIDESYLVAQTVTFWQV^KIJKIPXEV 
AKEAPKEEVLSLSRSWAKYLENAQIRKEIYVPNKLVNFVL 

CPn_0154 197874 199202 

gseA-KDO Transferase 

T S EFC PMMLRGVHR I F KC FYDWLVC AFV I AL PKLLYKMLVYGKYKKS LAVRFGLKKPHV 

PGEGPLVWHGASVGEVRLIiPVLEKFCEEFPGWRCLVTSCTEZiGVQVASOvriPf^T^ 

SILPIJDFSIIIKSWAJOJiPSLVVFSEGDCWI^FIEEAKRIGATTLVINGRISIDSSKRF 

KFUCRLGKNYFSPVTCFIiQDEVQKQRFLSLGIPEHKI^VTGNIKTWA^ 

WRDRLRLPTDSKLVI LGSMHRS DAGKWLPWQKL IKEGVSVLWVPRHVEKTKDVEESLHR 

LHIPYGLWSRGANFSYVPVWVDErGI^KQLWAGDLAFV 

VPLIFGPHITSQSEIAQRLLLSGAGLCLDEIEPrirnVSFrJ^KJEWEAWQKGKVFVK 
AETAS FDRTWRALKS Y I PLYKNS 

CPn_0155 199697 199488 

No robust homolog present in Genebank/EMBL as of 11/7/98 

NS LSFGVPFLEKLK I S LI P I EEMRHELFMKTHNSS SNGFSNQ EKG I RTYFKSDLLG YEDL 

YFLRENIKPN 

CPn_0156 200147 199770 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LGKQKLLARHMMHNIWLSEEPGRSAFLGRTAFFPNKYP I AQGGVG I PSTIGNLFTIWYC 
FYFYRAATPQSDHPDGCGFILLERIJtEIXSAGFFTCDLRESNTTGFTLFFEGSNKGVLKNH 
LFIRDE 

CPn_0157 200753 200298 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FS FVTYKEALMN I YQF S PGAS PNWQAS LMAQLNSYFCLGG ETVTR 1 1 SLRPSGL I LAKKE 
KAWSTAEKILKILSF ILFPLVLIALAIRYLLYNKFNKDLDRAVFFIPTEITKAEELI I A 
KNPALVKEAALTVSPLFYSLPKKYQLMKVETP 

CPn_0158 201463 200894 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PPKITLSINIDLLLEDLDTDSI PWPKLYLSEDFDFAYYPESKAI IDTVAKLEKNNPGEEF 
CLESKKILARYLLEQLFKLETGLNFPTSTIDGGRESFLIEFSHETKKPTVWAFTYFYYYH 
SNGPKLEKDFKQAGCEVHNRLLNLGLK'i'R PQAGAQNDGRNGG PYGP IGFLI VWEENYGSV 
LKDHGFIKDN 

CPn_0159 201811 201467 

No robust homolog present in Genebank/EMBL as of 11/7/98 
CCFGGETATRIFSMTPSGFSLATEEfr/QVSTAEKVIKILALIFFPIILIALAIRYFLHRK 
FDRKCFVI PCDTPKELELI LAANPQLVEKAAREVH PGFFALPTKYQSMYIOTSKG 

CPn_0160 203794 202127 

pfkA-Fructose-6-P Phosphotransferase 

TVELLSLNKSYFEI0RLRYRPEILTLLETIRSKHIQETS3PPSPPPELQKHI PNLCRIPE 
V3 IYTEOETSSKPLKIGVLLSGGQAPGOHNW IGLFDALRVFNPKTRLFGFIKGPLGLTR 
GLYKDLD I SV I YDYYNMGGFDMLSS3P EK I KTEEQKKN I LNTVKQLKLDGLL I IGGNNSN 
TDTAMLAE\'FLAHNCKT3V [GVPKT I LTJDLKNCW r ET3LGFHTSCRTYSEM IGNLAKDAL 
3 A KK YH Hr I R LMGCOAG YTT L EGG LOT L PN IALIGELI AT RK 1 :3 L K 0 LS EOL ALG LVRR Y 
K3GK>r^3TVLIPEGLrEHTFDTPKLILELNVL[ ) ANGD.';3[EKILi>KL3PETLKTFHLFPK 
DtANQLL[,ARD3HGNVRVSK[ATEELLAVMVKKF:rCKtKPHMErH^V;;HFFnYEAR5LGFP 
3NFDTNYG I \LGI ISALFLVRQFTOYM IT INN[,AOi*YTEWCXjGATP[.YKMMliLENRCGTE 
TPVT FT DSVP P K 3 PAVOt ir.LOtjfJDfX L'/EPLYRrPGPLOYn iKEF.L f DQRPLTLLW ENQT 
HSPFOALYtfTdGKRSL 

f:F j n_0lhl .'(HhOH SQM'iH 

(pr "fiLCttM acy Lt i.irn^; r* i.i's. t-uiuly) 

HR:;. r ;T3RK0EPLLITi JWLLKHEQRTMI , :;r;rLLNNr'rPF''U.Li 1TPLI IYNPPYP ! V [ LLHG 

la:;[^ktgskr:;hvr:aueltrl/; e aalrvim ,i ,t ;i igdcet ji'LjMDFiii.knykijn c pe ii eyt 

WrA.Ul l DOER LA r R^JiJU :GTLALOTf A'FFNK L KALAVWAIT li'A -.KlUAAFAOFNAPEV I 
TMSOKCAITYAGMTLNPDFVTOFLKIDIVKELMPSARNLPPILYMQGEQDLLVSINHRTL 
FTEAFANQDKP IT ILTYPDVDHAFPFAESSALSDLTQWLKRELTSGE 

CPn_0162 205870 204803 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FVYTLYNIQSPFRIMKLYS ISSDVDTPWIFQLMSKVDSYLFLGGNRIKWSIVMQEPNLI 
[GKVENVR T3TI VKILKI LSFL rFPLILIALALHYFLHAKYANHLLVSKILERAPQYVPI 
fX'R^GDTASHYKLTTLVPVSOKNI^AMGSNPLEVEAALRTTKPSFFCVPAKYRQI I ISSH 

■ .\<r.-\.: i.::..;.,^, *:n'lo/. wit ■•i.;r:TMDt -' \dkp\ iwjiLrTrrrvztr. 1 *'".?? 

" .: i-'ML.O ! iLF T P" ' !' r 0 HUT-XL" ' H . 1"' c tPi.T' .:' 'VR f : :* "I [FTf ONPT rwrOVFFPQO 
PLDEDRGGGFEILEQI^ELGVRFPICPSOGPDNPNFQGFQGrRrYWEDSYQPNKEV 

CPn_0163 205831 206394 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FEKAIVYCIKCKQI IKCIS I IHTPTPATPLCTEGEI FPGPVDSAIQNDLERLLTVKKRPD 
1 1 REYLRAGG S L VTTY P KEGQ RLR S P EQL RVLDDLVQSY PNHLHAI ELDCGA I PQDL IG A 
TYI ITFADFSTYI LSLRSYQANS PSDDTWGIWFGS IDDPVQAVI S F LKDHG F AL PSTLAQ 
DPLLCTNK 

CPn_0164 206444 206998 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LCFKC I YIKI I FSFLKQLMTRST I ESSDSLCSRSFSQKLSVQTLKNLCESRLMKITSLVI 
AFLTLI VGGALI ALAGGGVLSF PLGLI LGSVLVLFSS IYLVSCCKFFTLKEMTMTCSVKS 
'KINIWFEKQRNKDIEKALENPDLFGE^JKRNVGNRSAF^^ 
MYFYL 

CPn_0165 206983 207582 

No robust homolog present in Genebank/EMBL as of 11/7/98 
^^/LI^FM^^WPKTIDHVT)PESEIDIRKVVSCYKLIKECQPEF^ 

RSKYQ EQARTVSDEDAPLFCLTRSYYQDGYLT PLRAG PRDL INHYI HLRRRENPKHF FSP 

KHPCYYARLAFNESVCVYRELFDIERLTKMYVEGDYSKEQEK^ 

L I EHKDTDL IGRGFTDVFCT 

CPn_0166 207594 207962 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NCIJ<QY^SDSIMSESINRSIHLEASTPFF1KLTNLCESRLVKITSLVI SLLALVGAGVT 
LWLFVAG I LPLLPVL I LE 1 1 L ITVLVLLFCLVLEPYL I EKPSKIKELPKVDELSWETD 
STL 

CPn_0167 208309 207977 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NLWj? H F PRG FFMLP FC PT I LLAK P FLNS ENYG LERIiAATVDSYFDLGQSQ IVFLS KQ DQG 
I TASSELS AKDRKFKPGSMNCTLYT ED P I L PAHNS FSNCS D I OMRTPI SP IH 

CPn_0168 208716 208417 

No robust homolog present in Genebank/EMBL as of 11/7/98 

S YI^RRRENPEHFFNPGHPCYYARLAFNESVR IYPJCLFNTAELKQMYGAGDYEQQNEDN 

U^LSFVQ ILDEKDGFDDFLATHKDTTFIGRGGADI FCS 

CPn_0169 209537 208710 

No robust homolog present in Genebank/EMBL as of 11/7/98 
SE^jEFTIGENNMKNVGSECSQPLVMELOTQPLRN^ 

TALftGAG ILSFLPWLVLGIVLWLCALFLLFSYKFC P IKELGWYNTDSQIHQWFQKQRN 
KD£EKATENPELFGENRAEDNNFJ3ARSQVKET1^^ 

TMPDVDPVS EDS IRTVI SCYKL IKACKPEFRSLI SELLRAMQSGLGLLSRCSRYQERAKT 
VSHKDAPLFC PTHSYYRDGYLT PLRAG P RY I INRAI 

CPn_0170 211098 210025 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NVjkKNH 1 1 RGEKYNTCTV I AFVLSMSYDT LF KNLEKEDS VHK I CNE I FALVPRLNT IACT 
EA;jI:iEKNLPKADIHVHLPGTITPQIAWI 

FRNFQD I CHEKDPDLSVLQYNI LNYDFNS FDRVMATVCGHR F PPGG IQNEEDLLL X FNNY 
LCj&rLDDT I VYTEVQQNIRLAHVLYPSLPEKHARMKFYQ ILYRASQTFSKHG ITLRFLNC 
FNKsEFA I NTQ EPAQ EAVQWLQEVDST F PGLFVG I QSAGS ES APGACPKRLASGYRNAY 
DS^CEAHAGEGIETRTIFSSAKVNPEGLIEITRVTFSSLKRKQPSSLPIRVTCQLG 

CPn_0171 212444 211149 

*guaA-GMP Synthase 

1 1 KLQSARRHLNT I FILDFGSQYTYVLAKQVRKLFVYCEVLPWNISVCCUCERAPLGIIL 
SGGPHSVYENKA PHLD PEI YKLG IPX LA I CYGMQLMARDFGGTVS PGVGE FGYTP I HLYP 
CELFKH IVDCESLDTE IRMSHRDHVTT I PEGFNVI ASTSQCS I SGI ENTKQRLYGLQFH P 
EVSDSTPTGNKI LETFVQEICSAPTLWNPLYIQQDLVSKIQDTVI EVFDEVAQSLDVQWL 
AQGTI YSDVI ES SRSGHAS EVIKSHHNVGGLPKNLKLKLVEPLRYLFKDEVR ILGEALGL 
SSYLLDRHPFPGPGLTIRVIGEILPEYLAILRRADLIFIEELRKAKLYDKISQAFALFLP 
IKSVSVKGDCRSYGYTIALRAVESTDFMTGRWAYLPCDVLSSCSSRIINEIPEVSRWYD 
ISDKPPATIEWE 

CPn_0172 213237 212440 

*impD-Inosine 5'-monophosphate dehydrogenase (COOH-terminal 
region only) 

AP IGAAIG IGPLG 1 3RAHHLVEAGANVLVI DTAHAHSKGVFQTVtEI KSQFPQISLWGN 
LVTAEAAVSLAE IGVDAVKVG IGPGS ICTTR I VSGVGYPQ I TA I TNVAKALKNS AVTV I A 
DG R I RY SGD WKALAAGADCVMLGS LLAGTD EA PGD I VS I DEKLFK RYRGMG SLGAMKQG 
3ADRYFQTQGQKKLVPGGVEGLVAYKGSVHDVLYQILGG I RSGMGYVGAETLKDLKTKAS 
FVRITESCRAESHIHNIYKVQPTLNY 

CPn_0173 214041 213715 

No robust homolog present in Genebank/EMBL as of 11/7/98 

T I FDL I YKI DSYKHQQGFMDFSVFPDRFVESTS PS PI ED I DAKTLVSNCCHYCSRCLFI F 

[..'JLLSI r I C Ft'VYGTSG ET AS LV FG I LS L I VLVL L I IECRNRECCRRIS 

CPn_0174 214215 214724 

No robust homolog present in Genebank/EMBL as of 11/7/98 
Y L F LWF7EK I V I L.'jM IMTT I SNS PJ PALM PELS LI pp PT LVS.^TTQrJLAYT I PAQCRRS 
TLR HLhlFl I r LGLATI ICTFIVIFFLNGLNLLGTP^ [ ICnrJCLI I VGLLFLIMGLYFM 
ir^LDC/JLVGLLQKELCOAEEREEEYIOCrEALRGAPRACCPTEiJPSTWL 

<:Pn_ni7 r , iUH'jh .H5275 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LIJ J ACFOFr.L.KI<PDMEOPNCVrODTTTVLYALNSror'RLCDDTHRLGKQOPLEA£NALGE 
F f R" 1 1 , 17 I'M.' T F P [ , EE VA f P I LPGY! I PKFYLSF I DRDIXjGVHYFA/LDGVFLKTVAACI I ENS 



FLTDSMSPELLSEVKEALKP 

CPn_0176 215271 216518 

CT153 hypothetical protein 

^roDDPMDESIX3EEASKDSAFSASFSYEFVK3STRESK^^rVT^i OTAS RTL Y I LRODCSYDP 
RALKVD D E F R YWEKR LDAKN P DS LN AFVK EVGT H Y V AS VT YGG I G FQV LKM S YLQVE E L 
EKEKISISVAAASSLLKSKTSNATEKGYSSYQSESSAGTVFUX7TVLPDLG0DKLDFKDW 
S E3 1 PNEP I PLA I SVS SITDLIIPELFPS EDAQVLSQKK S ALCQV I LNY LES HK PKEEG P 
, , m ^p,,,.,-. n ,. - . ; ,:^ P i™ r rp7yp>".r" r ^v» ^r-*',K r " T *"'"AOPLGFYLRF 

- t r , .j, :N t .vurr ' : : .a.; ~ • ; ' pypa; . ' ! . w; va ( < ,ykdrctwt 

LEKLNTTGDLr [RJGDEIPLKHNTo'jPii UMTJMJUJ : L'CTrtfroU^ V F I IT/ 
CPn_0177 217513 216608 

No robust homolog present in Genebank/EMBL as of 11/7/98 

dkrec/tkskfifliseesmkopmslifssvclglglgslsscnqkpswnyhntstseeff 

VHGNKSVSQLPHYPSAFRTTQr FSEEHNDPYWAKTDEESRKIWRE IHKNLKIKGSYI PI 

STYGSLMHPKSAALTLKTYRPH P IWINGYERS FN IDTGKYLKNGSRRRTSHDGPKNRAVL 

NLIKSSGRRCNAIGLEMTEEDFVIARRREGVYSLYPVEVCSYPQGNPFVIAYAWIADESA 

CSKEVTLPVKGYYSLVWESVSSSDSI^'AFGDSFAEDYLRSTFl^XTTSILCV^ 

QP 

CPn_0178 218052 217789 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VKE^DFLVQRNVERDPCTKRHCWSQKFGGESIDAKTITGQLFHIAGKTEPGHGKLCIjG 
ES I LKQLLALGI ITGYENREREVWVTLD 

CPn_0179 218550 218056 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PKIWDTHFETRIEATSVPKFNRRIJIKSFHKSGRSSRPSKA'CVANFFNFTLOAGRSGI I PG 
KKAIIXNVNIJAKTPNYSCIFESIGFFNEQDLEAQHNG^AALVTIKILKW 
LPRSLKKDRKFMSSLIFTKLSYALDLSAPMHLEGKPNLSYEEKLD 

CPn_0180 218963 218355 

No robust homolog present in Genebank/EMBL as of 11/7/98 

TS LHKI LDCKYKPVF IQNTVASETYPSQ I LHAQREVRDAY FNQADCHPARANQ I LEAKK I 

CLLDVYHT>JHYSVFTFCVDNTPNLRFTFVSSKNNEM^ 

LAACKIRNIEVPRVVGLDLRSGILISKLEUCQFCFTJSLTEDFVbmSTNQEEA^ 

LISLILLCKQAVLESFQEKKRSS 

CPn_0181 219175 218777 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VH£LFKIDGVYYFFKKF>JKLFYNNYSLNSHHEKPSSLEKAVO 

DD ISREIYCVRRLYIRFWIVS ISQSLSRI PWRLKRI LLRYCTLRGKYVMPILIKRI AILL 
GL I RFSRLRKSVY 

CPn_0182 220704 219334 

accC-Biotin Carboxylase 

RCIMKKVTilANRGEIAVRIIRACHDLGLSTVAVYSLADQEALHVLLADEAICIGEPQAAK 
SYLKISNILAACEITGADAVHPGYGFLSENANFASICESCGLTFIGPSSESIAMMGDKIA 
AKSLAKKIKCPVI PGSEGI IEDESEGLKI AEK IGFP I VIKAVAGGGGRG IRIVKEKDEFY 
RAFSAARAEAEAGFNNPNVYI EKFIENPRHLE I QV IGDTHGNYVHLG ERECT IQRRROKL 
IEETPSPILNAEIRVKVGKVAVDLARSAGYFSVGTVEFLI^KDKKFYFWE^ 
ITEEVTGIDLVKEQIHVAMGNKLPWKQKNIEFSGHIIC^RINAEDPTNNFSPSPGRLDYY 
LP PAG PS I RVDGACYSGYAI PPYYDSMI AKVT AKGKNREEAI AIMKRALKEFH IGGVQST 
I PFHQFMLDNPKFLESNYDINY IDNLLAQGNSFFKEF 

CPn_0183 221207 220695 

accB-Biotin Carboxyl Carrier Protein 

RRLGMDLKQIEKI^IAMGRNGMKRFAIKREGLELELEKDTREGNRQEPVFYDSRLFSGFS 
QERPIPTDPKKDTIKETTTENSETSTTTSSGDFISSPLVGTFYGSPAPDSPSFVKPGDIV 
SEDTIVCIVEAMKVMNEVKAGMSGRVLEVLITNGDPVQFGSKLFRIAKDAS 

CPn_0184 221814 221221 

efp-Elongation Factor P 

QWK I KFCCCEEK IMVL SSQ LSVGMF I STKDGLYKVT SVS KVAG PKG ES F I KVALQAADSD 
W I ERNFKATQ EVKEAQFETRT LEYLYLEDESYLFLDLGNY EKLFI PQ E IMKDNFLFLKA 
GVTVS AMVYDNWFSVELPH FL ELMVS KT DF PGDSL SLSGGVKKALLETG I EVMVP PFVE 
IGDVI KI DTRTCEYIQRV 

CPn_0185 222457 221765 

rpe/araD-Ribulose-P Epimerase 

A EVKKQ ESVLVG PS IMGADLTC LGVEAKKL EQAGSDF I H I D I MDGH FVPNLT FGPG 1 1 AA 
INRSTDLFLEVHAMIYNPFEFIESFVRSGADRIIVHFEASEDIKELLSYIKKCGVQAGLA 
FS PDTS I EFLPSFLPFCDWVLMSVYPGFTGOSFLPNTI EKIAFARHAIKTLGLKDSCLI 
EVDGG I DQQSAPLCRDAGADILVTASYLFEADSLAMEDKI LLLRGENYGVK 

CPn_0186 222878 224068 

*similarity to Cps IncA 

PI KDKILMSSPVNNTPSAPNIPIPAPTTPG I PTTKPRSSFIEKVI IVAKYILFAIAATSG 
ALGTILGLSGALTPGIGIALLVIFFVSMVLLGLILKDSISGGEERRLREEVSRFTSENOR 
LTV ITTTLETEVKDLKAAKDQLTLEI EAFRNENGNLKTTAEDLEEQVSKLSEQLEALERI 
NQLI0ANAGDA0EISSELKKLI3GWDSKWEQINTSIQALKVLLGQEWQEAGTHVKAMQ 
EQ IQALQAEI LGMHNQSTALQKSVENLLVQDQALTRWGELLESENKLSQACSALROEI E 
KLAQHETSLQQR I DAMLAQEQNLAEQVTALEKMKQEAQKAES EF I ACVRDRT FGRRETP P 
PTTPWEGDESQEEDEGGTP PVSQ PSS PVDRATGDGQ 

CPn_0187 224213 225045 

predicted methylase 

VFLTYTRTLPMHSKFLSRRKKNSSHKEETnWCWIASSYfJKIVODKGHYYHRETILPOLLP 
;>LTLGjK5SVLD IGCGQGFLEPALPKECRYLGIDI.J.CPL lALAKKMR^VTJSHC'FKVADLS 
KRLEFVEPTLFSHAVAILSLONMEFPOFArRNTATLLEPLGOFFIVCiNHPt'FR IPRASSW 
HYDErJKKAISRHIDRYLSFMKIPlMAHrf^OKD'JP.'JTLrJFHFPL.T/WFKCLCSHnrLVSGL 
EEWT'jr)KT'jTnKRAKAENLCRKEFPL,i , l »M I 'X [KIK 

Crn_0LRH 5^5090 .>.>i. A Oh 

CT132 hypothetical protein 

KT[-M:X'IMFRKLFPF:;KKKTG0K0RLRNN(1[J.0AL [O'HKVLLHNKA.'IKE-'.ArVL.^YYOLL 
■h "VI- r LVFFLRL^OI ILFTNLMWKEWI. [ I KFI-OYKK V 1 7A I VEAAYIl ATKNN I^LVLVt^F 
(-VKC WACULMLLJLEDf JLNKIFPT^WrF'L.:i.KI'r.V::YI"/r , PLV.';i'M[Fr IV( -f;::w[YETQ 
IMP [OYAKLF^'LliHSMTALYF I.'JRFVi-'Yr.LLYLAlJY,'' YAFLPPVA ruKTUALISTLT [C 

. :vw lvfokaff:;lcv:; [ FNY^FTYGAi.vALr":rr .li.i. / r ytm r vur-v x;altk r ionrcxt 
FtFLGDK E LP3CYLQL rT3TY Z LALTTPQFNEGL3PLTAQF IAKQ3KVP IGEVSQCLDVL 
EKEGFLFPYNNGYQPVFNFS ELT I KD I ADKLLHRE I FKKFNPDLG I TF I ENS FQN I FNQA 
3 KNKENLTL J E I ARR I K 

CPn_0189 226391 229825 

CT131 homolog-(Possible Transmembrane Protein) 
ANQMKRRSWLKILG ICLGSS IVLGFLIFLPQLLSTESRKYLVFSLI HKE3GLSCSAEELK 
tSWFGRQTARK tKLTOEAKDEVFSAEKFELDGSLLRLLIYKKPKGITLGGWSLKINEPAS 

;: iii-:"/::iU,i'i ..i ' 'i * ' . • mk - ' * • \:\. "/ . v tvtwti 

.'\-A/jv.* :nr i-t ../" - nrF-" ' v rr-wrhL- ■ l : [^l.V'^.\th 

TNDGKIS^ASGEGNQIQMKLQGHIHKSTFY IVEGJSSFIELKPELASALCNQI IPLSTP 
I TSKQ I HATVSYAKIPLDITKWKH I E ITSQAQLPEVAIH PKDPNLALQLRDTKLC I KKTE 
KF3DI RYSSSTVLGGASPSHLNGL I S I I^IKKHLTKFRLQQAQLPHTYLRAI FPQPFVINV 
PL DVAYYS LNI EGTYKNAHLEADAI LDN P LLKLSCSMSGTWKNFLFKGQGTYH FNKKWQE 
I LS PHF S Y AEARFSGKAQ ITDTNLFF PKF SGK ITARENELL I HAKFGS PNEP I KPETT S I 
L I HGQFCSLPLSLVSNHI^PFHLKKLTFSFHTDGGKFVTKGNLQAL I ENPDYPDLNNTRI 
LI PDLLLSLDESSTSPSSKDLKIQGSGEI FSLPLDS ITKTYGKQVRLSPYFGSSGDLNFV 
VNYN PKDQNKLTLLSNFKSEALLGELKLVMDF SMKLS SGTQGTLCWEVS PER YAS F FKNA 
SCSPTCLLHRTANVRLDISKLSCPEETKGLSCLTLiAAGGLEGSLEATPLIFYDNVSKET 
FIINDFKGSLJWn^LDAKIEYDLKGSCLAPRQDSKTLAEFSLEGQVDHLFSPESREFKQT 
A^IHIPSSFIAGIIPMSPGLKAQISSLAGPRINVSIKNAFRFGEGPVDIMVDSEM.QAQ 
IPLILNEKSIIiRENLTAHLSINEDVNKAFl^EFNPLLAGGAYSQYPVTLEIDKQNFYLP 
IRPYSFEEFRIQSATLIDMGKISIAhrnrr>rYALFQFLDITDQKQFVESWFTPIFFSVQKGS 
IICKPJ^DALIDRRIRIJ^LWGKTDIAHDRLF^I^IDPEVIKKYFH^SLKTKNFFLIKIR 
GSISSPEVDWSSAYARIALLKSYSLGNPFSSLADKLFSSLX3DSTPPPTVHPFPWEKSNFD 
SIENK 

CPn_0190 229901 231274 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LLGIKIJ^IRKRHSFDSTSTKKEAVSKAIQKIIKIMETTDPSLM^ETP^IAEIESILQEIKEI 
KQKLS KQAEDLG LLEK YC SQETLSNL ENTNASLKLS I GS VI EELAS LKQLVEES IEESLG 
QQDQLIQSVLI El SDKFLSS IGETLSGNLDMNQNVIQGLLI KENPEKSEAASVGYVQTLL 
E P LS KR I G ETHKKVAT HDVN I S S LQF HMM SVAGGRFRGH I DMNGYRVLG LGE P KNG EDAV 
SKDYLERWSSQLTIDKVEDKPITOPNKGKLLYSQGTSPKLESPLPI^LLTSGISGFTWK 
SASKSNDG S F PF S ALRHKETESDTDC FQ ITSTTLSGNQAGTYTWSLSLKVLVP S IFQ IEK 
PEVQLSLVYSYEDWLP I DN I FNMSQPRT I PLALLGQTMLAGQKYDI LELAAHQTNQTLMI 
S PNCSRFSLQLKQTNQFENS PVDFYIVHAAHSCHWSGF 

CPn_0191 232039 231314 

glnQ-ABC Amino Acid Transporter ATPase 

q hfpvflgyqkregvmt i rvrniaysvnkkk ilix3vtf slergh itlfvgksgsgktm i 
lraILaglvqptqgdiwiegeapalwotpelfshmtvlgncthpq 
ffilxhij^ieevaknypdqlsggqkqrvaivrslcmdkhtiifdeptsaij3pf 
lks&lrdqeltvgltthdmqfvhscldriylidc^ 

%1 

CPn_0192 232643 231984 

glnP-ABC Amino Acid Transporter Permease 

E^G^DHWLAIARLLLRGCGYTLCVSG IGILCGS ILGLLIGTVTSLYFPSKLTKLLANSYV 
TVrRGTPLFIQILIIYFGLPEVLPIEPTPLVAGIIALSMNSAAYLAENIRGGINSLSIGQ 
Wf 5AMVLGYKKYQIFVYI I YPQVFKN ILPSLTNEFVSLI KESS ILMWGVPELTKVTKDI 
V|K|LNPMEMYLICAGLYFLMTTSFSC I SRLS EKRRSYDN 

CPn_0193 233144 232686 

*argR-Arginine Repressor 

KL HGVFMKKKVT I DEALKE I LRLEGAATQ EELCAKLLAQGFATTQS SVS RWLRKI QAVKV 
ACSERGARYSLPSSTEKTTTRHLVLS I RHNASL IVIRTVPGSASWIAALLDQGLKDEILGT 
LAGpDT IFVTPIDEGRLPLLMVS I ANLLQVFLD 

CPn_0194 233162 234241 

gcp-O-Sialoglycoprotein Endopeptidase 

EV^ETI KGNVFFSWFFMLTLGLESSCDETACAIVNEDKQILANI IASQDIHASYGGWPE 
LASRAHLHI FPQVINKALQQANLLI EDMDL I AVTQTPGL IGSLSVGVHFGKGIAIGAKKS 
LIG^THVEAHLYAAYMAAQ^0FPAIXLW5GAHTAAFFIENPTSYKLIGKTRDDAIGCT 
FDK^GRFLGL P Y PAG PL I EKLALEG S EDS Y PFS P AKVPNYDF S F SGLKTAVLYA I KGNNS 
SPSSjPAPEISLEKQRDIAASFQKAACTTIAQKLPTIIKEFSCRSILIGGGVAINEYFRSA 
IQTACNLPVYFPPAKLCSDNAAMI AGLGG ENFQKNSSI PE I RICARYQWESVSPFSLASP 

CPn_0195 234172 235785 

oppA-Oligopeptide Binding Protein 

YSGNSYMRKISVG IC IT ILLSLSWLQGCKESSHSSTSRGELAINIRDEPRSLDPRQVRL 
LS E I SLVKH I YEGLVOENNLSGKI EPALAEDYSLSSDGLTYTFKLKSAFWSNGDPLTAED 
F I ESWKQVATQEVSG I YAFALNPI KNVRK I QEGHLS IDHFGVHS PNESTLWTLES PTSH 
FLKLLAL PV F F P VH K S QRTLQS KS LP I ASGAFYPKN I KQKQW I KL S KNP HYYNQSQVETK 
T IT I HF I PDANTAAKLFNQGKLNWQG P PWGER I PQETLSNLQSKGH LHS FDVAGTSWLTF 
N I NKF PLNNMKLREALAS ALDKEALVST I FLGRAKTADHLLPTNI HSY P EHQKQEMAQRQ 
AYAKKLFKEALEELQITAKDLEHLNLIFPVSSSASSLLVQLIREOWKESLGFAIPIVGKE 
FALLQADLSSGNFSLATGGWFADFADPMAFLTIFAYPSGVPPYAINHKDFLEILQNIEQE 
QDHQKR S E L VSQASLY LETFHIIEPI YH DAFQ F AMNK KLSNLGV S PTGWDFRYAK EN 

CPn_0196 235906 237519 

oppA-Oligopeptide Binding Protein 

KLKSYSKERSFMLRFFAVFISTLWLITSGCSPSQSSKGIFAA/NMKEMPRSLDPGKTRLIA 
DQTLMRHLYEGLVEEHSQNGEIKPALAESYTISEDGTRYTFKIKNILWSNGDPLTAQDFV 
SSWKEILKEDASSVYLYAFLPIKNARAIFDDTESPENLGVRALDKRHLEIQLETPCAHFL 
HFLTLPTFFPVHETLRNYSTSFEEMPITCGAFRPVSLEKGLRLHLEKNPMYHNKSRVKLH 
K I IVQF I3NANTAA ILFKHKKLDWQG PPWGEP IPPEI SASLHQDDQLFSLPCASTTWLLF 
NrQKKPWNNAKLRKALSLAIDKDMLTKWYQGLAEPTDHILHPRLYPGTYPERKRONERI 
LEAQQLFEEALDELQMTREDLEKETLTrSTFSFSYGRICQMLREQWKKVLKFTIPIVGQE 
FFT[OKNFLECIW:;LTVNOWrAAFIDFt1JYI^IFANFOG[aPYHLQD::HFCyrLLIKITQE 
1 tKKHLRNQL [ I EALDYLCHOH ILEPLOEPNLR TALNKN TKNFNLFVRRT'jDFRFIEKL 

OPnJH'>7 .M75LJ ' t^HSJ 

oppA-Oligopeptide Binding Protein 

KMYKRK: ILOLKFf): 'K{* I KV [ FKMF' *RW [TLFLLF i.JLTt -^Y^^KHKQ.'LI IPTHDDPV 
AF:;t'KOAKRAMDL:J t AOl ^LFCX ILTHFT! iRFSNPLFLA rA'JRYTV^EDFC^YTrFr KDSAL 
W.'IDGTtMT.SKDrRNAWCYAOCN^PH [0 f FOnLNF'^TP'JSNA [T [HLDSPNPOFPKLLAFP 
AFA I FKPENPKI.FttirYTLVEYFPCJHN IHLKKNPrrrYDYIfCV;; [NS iklli r PDIYTAEH 
fjt.Nl^f ^KVTJWVf Xjf'WHO 1 ' I f'WEI.HKQ' Vi HYYTY E'VEOAFW[.CLNTKSPH UIDLQNRHRLA 

T( : r i >k h : ; [ 1 1-: E--.A i r r oo pact Li tr; i a po pnq y k ko k p lt vq f.k lvlt y p r \ d r lrc q r r a 

KH.KWjWKAAC 1 1)\ , I lA'x II .EYHLPVNKRK VQDYA r ATfJTf IVAYYWANI, [ :;EGDKLLONF 
E It P IYYL3YDYLTQDF 1 717 V I YN A. jG AVDLK YT Y T ' ' 

CPn.Ot^H 2J9145 240V^h 

oppA-Oligopeptide Binding Protein 

Q I EYY IMKMHRLKPTLK3L I PNLLFLLLTL33CSKQKCEPLGKKLV IAMSHDLADLDPRN 
AYLSRDASLAKALYECLTP ETDQG tALALA E5YTLS KDKKVYTFKLRPSWSDGTPLTAY 
DFEKS IKQLYFEEFSP3 1 HTLLGVI KNSS A IHNAQK3LETLC IQAKDDLTLVITLEQPFP 
YFLTLrARPVFSPVHHTLPESYKKGTPPSTY I SNGPFVLKKH EHONYLILEKNPHYYDHE 
r-pV^LF -T r>r iAr""~' J '* F^'''-'" n W T ^''' 5 W^AP[ "N^rOiTvT I ,'77FKr T -"/^citrTT 

kU'IftAkArt'jEAK.TIL-^ /_L/\^L^ l~i L .-.^N^I ZACL L ,twLKLTLOLr lr IQCMC 
YHCFIJCKRRQGDFFIATGGWIAEWSPVAFLSILGNPRDLTOWRNSDYEKTLEKLYLPHA 
YKENLKRAEMIIEEETPI I PLYHGKYIYAIHPKIQNTFGSLLGHTDLKNIDILS 

CPn_0199 241018 241983 

oppB-Oligopeptide Permease 

KCL IGLSLVFSY IKNR ILFNLLSLWI VLTLTFLVMKT I PGDPFNDEGCNVLS EEVLQTLK 
SRYGLDKPLYQQYTGYLHS I AKLDFGNSLVYKDRKVTN 1 1 STAF P I SA I LGLQ SLFLS IG 
GG IALGTIAALKKKKQRRY I LGAS ILQ1SI PAF I FATLLQYVFAVK I PLLP I ACW3SFTH 
TILPTLALAVTPMAFI IQLTYSSVSAALNKDYVLLAYAKGLS PLKWI KH ILPYAIFPTI 
SYSAFLTTTVI TGT F A I ENIFCI PGLGKWF ICSIKQ RDY PVALGLS VFYGTL FMLSSLL S 
DLIQS I IDPQIRYAHGKEKKRK 

CPn_0200 241996 242868 

oppC-Oligopeptide Permease 

EKKKHKLMENLSSAPSRSIWKS I IQNKMLVLGLTTL 1 1 LMLGALLLFWFYQDYEQTSLKlD 
I LVS PCSRFPFGTDTLGRCMFARTLRGLRLSLLIAT I ATLIDVCVGLLWATVAISGGKKI 
DFLMMRTTEILFSLPRIPIIILLLVIFHHGLLPLILAhTTITGWIPISRIIYGQFLLLKNK 
PFVLSAKAMHASTFHILKKHIXPOTLAPIISTLIFTIPNArYTEAFISFLGLGIQPPQAS 
LGTLVKEGINAIDYYPWLFFFPSLIMIALSISFNX.IGEGAKTLCLEEGSHG 

CPn_0201 242810 243715 

oppD-Oligopeptide Transport ATPase 
ASISSARGLKHWSKRDI>£DNYIJ^IKDLTITSTNP 

GSGKTTITKAILGFLPENCLIKTGSII^EDIDITKLSPKEl^KrRGQKIATIWNJ^ 
TPSMRIGMQI I ETLRQHHKMNKEEAYNKAMQLLTDVCI PNPKYSFSQYPFELSGGMRQRV 
VI AIALASQPKL I LADEPTTALDSMSQAQVLR ILRN IQQQKQAT I LLVTHNLSLVKELCN 
DICIIKDGKLIETGTVEEIFLSPKHPYTXKLI-NAVSKIPIKKTSSPILKNKFQ 
GL 

CPn_0202 243682 244500 

oppF-Oligopeptide Transport ATPase 

VPTSNEYARWFMTTLi.SIKDLSLTIRGKKIIJWINIiKLIKGSYLTIVGPSGSGKSSI^^ 
ILDLLKPTTGTITFHMDPKI PRARKVQVIWQD IDSSLNPCMS IKG 1 1 SEPLN I IGTYSKA 
EQNKEIYNVXXJLVNLPKSVIiiLKPYKJ-SGGQKORIAIAKALVSKPEIX^ 
NQSLILDLFQTI KKEYQNTLLF ITHDMSAAYY IADTIAVMDQGSLVEHACREKI FSTPKH 
TTTQDLLDAI PIFSLI STEMEPSEEYELQVASK 

CPn_0203 244966 245802 

No robust homolog present in Genebank/EMBL as of 11/7/98 
IVPLPQKN>fiCETSCMNTYTFSPTLQKSFSLFLLEKLDSYFFFGGTRTQILVITPTWrR^ 
AKKRGCKVST I EKI I K I LS FILLPLVI I AF I LRY FLHKKFDKQFLC I PKVI SNEDEALLG 
SRPQAVEKAVRE ISPAFFS I PRKYQLIRIDTPKDDAPS ILFP IG I EI ILKDLCIDTLKQS 
NLFLKREMDFLG H P EEKALFDS ICS I EKDQEWMSLE SKKLL I T HFLKYLFVSG I EQLNPG 
FNPENGRGYFSE I STAKIHFHQHGRYGP IRSSGPIMKE I 

CPn_0204 245691 246002 

No robust homolog present in Genebank/EMBL as of 11/7/98 
PREWAWVFFRNKYSKDPFSSARS IWANPFFGTHHEGNIKIKGMGYQ IFTRLKKLGISFSS 
YNS INPNPYFFDEGCFVYWESQFKSALQDHG ILQKQTETFYRNT 

CPn_0205 246073 246327 

No robust homolog present in Genebank/EMBL as of 11/7/98 

I EDS IKGYGSASAFRNPPOLLLKFFLVCEELC ILTVATHRALLETPLALSFFKELKTKYV 

YRAKDILQLHNYKG FTILNTS PLCS 

CPn_0206 246346 247161 

CT203 hypothetical protein 

IVDRRSPACYDSIKSDAIGVSLLMDISHILEDLAYDEGILPREAIEAAIVKQMOITPYLL 
H I LHDATQRVPE IVNDGSYQGHLYAMYLLAQFRESRALPLI I KLFAFEDDTPHAIAGDVL 
TEDLPRILASVCNDDSLIKELI ETPKINPYVKAAAI SGLVTLVGAGKIPRDKVIRYFAEL 
LNYRLE KQ PS F AWDNL I AG I CTLY PG ELFY P I SKAFDGG LVDTS F I SMEDVEN 1 1 H EETV 
ESCIHTLCSSTELINDTLEEMEKWLEDFPIEP 

CPn_0207 247208 248617 

ybhl/sodiTl-Oxoglutarate/Malate Translocator 

VNKKKRFLSLLFLTAVLLGIWFSPHPASINSNAWQLFAIFTTTIMGIIFQPVPMGAIAII 
GISTLLLTQTLTLEQGLSGFHNPIAWLVFLSFSIAKGI IKTGLGERIAYFFVSALGKSPL 
GL3YGLVITDFFLAPAI PSVTAPAGG I LYPWTSLSDSFGSSAEKGTQDLIGSFLI KVAY 
QSGVrTSAMFLTAMAGNPLVAALAGHVGVSLSWVLWAKAAI I PGLPSLFLMPI ILYKLYP 
PKIT3CEEAIRSAKLRLKEMGPLKKEEKTILMIFFLLWLWTFGDLLGISATTAALIGLS 
L L I LTN I LDWQK DV I ANTT AWETF rWFGALI MMAS F LNQLGF I PLVG DS AAALVSGLSWK 
IGFPLLFLI YFYSHYLF ASNTAH IGAMYP IFLAVS I SLGTNP I FAALTLAFASNLFGGLT 
HYG3GPAPLYFGSHLVT\'0EWWRSGFAL3 1 VN I VI WIG IGSLWWKALGLI 

CPn_0208 248935 250602 

pfkA-Fructose-6-P Phosphotransferase 

SVAVILMHPLYVDLDTriSSYSPPLPKEFOEAASLIAVPDTGHSKPWPGVKTLFPQTYH 
LPYLKFVQGENWMTPLK\'GWFS<^PAKXjH^IOGLFNSLKDFHPDSSLVGFVNhWrc 
LTriflKS [DITEEFLSKFRN.n IGFNC flTGRKK IVTPEAKEACLKTAEALDLDGLVr IGGD 
(;:;rrTATA£LAEYrAKRRrKT-; rVGVPKTIDGDLQHTFLDLTFGFDTATKFYSS I ISNISR 
DAL:jCKA}{YHFIKr^K.^R>\.:f(rALE( ALQT! Wtt I AL CJEE I AKKNLPLKTI LHKICSVIA 
f/PAAMEKYYGVI L [ VFC I I TFT PHI T III . TTR [ C^LilEYEDK E^RLJPE^QRLLKSFPAP I 
I F/j r f t IM DR D AUG NV Y V : ' K I . V \, Y L 1, 11 1 1 .V : 1 N H LOO Y V M IV P FNA I ; ' H F U J Y EGR SG L PT K 

FWrrY';Y*:LGYf i/v: ttA-RNrf :x;Y[;;Tn- ;i,A( t j fmk , //kl.ra[1'WKMFTVKQOADGTLQ 

I'K IKKYLVDIG'JT.\KRKi'\LVi'KIWALnj::Yr'!- U 'A'l/j I CTi'E'FMH: MJNFPPLTLLLNHK 
FWOPIiy ;cr LIP PITY 

<'I'ri_f).'l) , > J'.!V;V /".L.*/." 

No robust homolog present in Genebank/EMBL as of 11/7/98 
Hl'Jltl n<l.KMLTY::r::T! L^VlWrA.f I MKKI .l/IYF'JhVJKKI'PV E A t'l'E'A( J[.AI AYEQN I 
HLJMTVKILKYLSFPRSLLRTT.'jLWYRP 
CPn_02LU 2524^ 25M4G 

No robust homolog present in Genebank/EMBL as of 11/7/98 
YQKLWEREP.r^FKTIREKEHATrSTMLVELEALKREFAHLKDQKPTSDOEITSLYQCLDH 
LEFVLLGLOQDKF LKATEDEDVLFESQKA IDAWNALLTKARWLGLGDIGAIYQTI EFLG 
AYLSKVNRRAFC IAJE EHFLKTAIRDLNAYYLLDFRWPLCKI EEFVCW3NTDCVEIAKRKL 
CTFEKETKELNErSLLREEHAMEKCS rODLORKL"Dr : tELHDVSLFCFSKTPSOEEYQKD 

-v-VK;..t-:j^ :. ( ,7ME'i v- \i n *i. ri "mi- -:l " --c^t- 
CPn_0211 252765 252463 

No robust homolog present in Genebank/EMBL as of 11/7/98 

EC VMS Y P D I SNVQA S S I Q S ALL HKTSDQIQQKRCFKQSTFVI LAVS LV 1 1 GS LFLLAGVA 

I LTVFS HGVLSLVFGVLG I VLGLLLLAGGVGLLV EEAKSLL 

CPn_0212 254081 252888 

No robust homolog present in Genebank/EMBL as of 11/7/98 
ELSYGVWS IYSEILSFSELTSCKHSLFPFGPIETASIRIHHVFNWIVCLI ILGTLFVC 
EjGMWLGVFSTYLIjGMSSMILGLLLISIGIJVLLKFKERYGLEP^ 

1 1 QMQDQ IADLARELD LEQ KKDTLIRG FS ARLDVLEGSKTEKKQ I LK IGVPRNLSE IQER 
AQEQNS I L EQCK EALL FRRKS AQE I FKKLYDRKAAFWRS YREDLWCYS E I HVSKKALSML 
Y I GDVF EGT A P H F LMEAYAJMC RT AKNL RNYVKV CVEDMR VNE EKKRAKQ LSVS ELLCCCT 
E I ETDLENETNLFTSDS EDVLEEYQ I HC I RVTMLHALWA IYNDEWSRK P I DTLDRVRAR 
MAVEDC I ETFEELQMC WHTKTLELEI AQLYVD I LLEA 

CPn_0213 254345 254190 

No robust homolog present in Genebank/EMBL as of 11/7/98 
ILWFSRVT FSNTNQIGIPRLELILPLWKKENDPFCFLFSRVEGTFIILNIK 

CPn_0214 255768 254446 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FI^LKEDYERPTYCNIPPAPHPQRVPSKGCIASKVSTVVWALFII^IFFl^GSIAFLVH 

TSCGVLLGAALPILCIGLVIJLAVALIVFLCHKHKTRQDLDYYDQDLDSLVIHKKEIPNDI 

SELRVTFEKI^NLFQFHTKDFSDLSQEIJ^KFINC^IEKWLTLEDEVTKFLrVRDRFLETR 

RNFTT FGEQVKG IQSNIFDLHEEKSS L YLELYRLRKDLQ VLLNF FLL PPG I L KVDYDE I E 

AIKGLFIRLTSRLDKLDVKAOERKKF INEMSREFKEVEKAFD IVDRATKKLMDRAKKES P 

ARLFMGRTESLLEMKKNEEAIJCNQGIJDPENL^HPELFSPYQQ 

LISGTVTSGLTLEECENRMRAASTGLNALLVRiCLQFPGAIKSAYTEKLTEIEKE 

VIKSLELELIHKIKDIVTEET 

CPn_0215 257039 255759 

No robust homolog present in Genebank/EMBL as of 11/7/98 

LTig&KKQVMSSAIARDCFPSPSPQPSSTLGVHPPK 

VN4rIFSFSVLWGLGGAGWLGSI*LLID3LIFFVSYHRK^ 

LR^EI^EVQEWS^LLDHWEDFKEWAQHKSQFATFEGDIJiFGREVEKYET IWKELDGR 
DWfiaLT ELKN IWG PLEFLRKKGDRLQC E I DKLRKEVMKVGKSGLKLAC ELTKFKS ALKDV 
K ^i^C YRDKRKVEKLEVF PEGYRRELLEVLKTRLSVEKE IQL FEEWS AFEEKLASLHR 
WF^EELQEALDKAKAErXDXQVRKSVV^ 

T F&S. EQ EKVL EEYEAL KAR I RXTIJIVKIXQVRANVAFVASTT DLLS ES E SLDGNDS VFED 
AtfjDDFLD 

CPn_0216 257623 257174

No robust homolog present in Genebank/EMBL as of 11/7/98
NKARTMNFVTFDRIQVDFIPEE)TSLJ^INSYIVAGGLLILGVVLSILSVICLDIGLVGLSA 
GAAFTLGLGCLIFALFLFS FSLILLLSQEKRVPDVLSLYLEKEVPQYET PLYKEDLESER 
DMgA : ISERLG I IEEKLRIAEKFRYSDSVFV 

CPn_0217 257881 258579

ypSP* 

PK££JCLKGFLSVNELI FGFOTFSVVVLGVFFASRGKAWLTGWLSLLSS IMNVFVLKQIHL 
WGFEVTSADVYVIGLLTCLNYAREHYEKNDINDAMLCSWIS I AF LVLTQLHLFL I PSPN 
DS |Q£H FLALF S ST PR I WAS LVTL I FVQ I VD I KLFTFLQRVFS KKYF AMRST I S LLFSQ 
Lr fiTI I FS F LGLYGLVSNLCDVM I F AMLVKG IV ITLAI PTLTVTKAVLDRRSS 

CPn_0218 259064 258582

No robust homolog present in Genebank/EMBL as of 11/7/98
IFLS'KKVFFESYEDFANVASSWPKSLRALVQGRYFVDSELKETPYRIHDFKKTPIHHRLY 
RSLPIISTIGGI IRLIEAHSGPIHPRDKMKYRFEVLQAVIEILGLGVLILVFDI IGCFLA 
FLVAI I LSLLLYCNSTFTCVQNLSFTERMLEG IGEAVNFLA 

CPn_0219 259348 260472 

tgt-Queuine tRNA Ribosyl Transferase

GSSLALKFHLIHQSKKSQARVGQIETSHGVIDTPAF^/PVATHGALKGVIDHSDIPLLFCN 
TYHLLLHPGPEAVAKLGGLHQFMGRQAPI ITDSGGFQIFSLAYGSVAEEIKSCGKKKGMS 
SLVKITDEGAWFKSYRDGRKLFLSPELSVQAQKDLGADI I IPLDELLPFHTDQEYFLTSC 
5RTYVWEKRSLEYHRKDPRHQSMYGVIHGGLDPEQPPIGVRFVEDEPFDGSAIGGSLGRN 
LQEMSEWKITTSFLSKERPVHLLGrGDLPSIYAM^/GFGrDSFDSSYFTKAARHGLILSK 
AGPrKIGQOKYSODSSTIDPSCSCLTCLSGISRAYLPHLFKVREPNAAIWASIHNLHHMQ 
QVMKEIREAILKDEI 

CPn_0220 260660 261236 

No robust homolog present in Genebank/EMBL as of 11/7/98
FYSFLKKKGIFYMSKESIRSYSEISTPTPIFRETPSKEGVAYKLQLRSPAKDCILRNRVS 
L KGALL RS I P F YGS FLGAK R I H S AWS AKD APCTTR V* ^HYLVGG L E LLC LGWVLAC KVLA 
TALKFLFSKASSKIKQMKWREKARNLAAKDTVCSIKEFCSVDLTSCFTRCFRLRNRWEE 
GASENQTVREIIV 

CPn_0221 261621 2620M

No robust homolog present in Genebank/EMBL as of 11/7/98
nAt.RYKYEIGrOMVNRYK.CSAEFSADHYYDDNLVRMO^KRNLRGUVPVCNEVCLFEENNL 
LE:>VMA:; I FIMG'IIL^L/JRLHSVWriTQDFKDCK I.j I ETHTALC [LETLGLCr IVLLIKIT 
I'V ELL I LFTPCLLOYFMY^AAYSDFPPt 

CPn_0222 ,if,2474 Jt.^M. 1

•wfMk rjimil.it Lry to Bacr «r ioprwj*' >iPl <Oit4) 

i ; [■: K F E . K r 'W EK L , R KLN A F EI /TQ PECYR NRWV LM FX ' I , KG P F< * R TQH A K VW: * Y RC VH EA S LY E 
Kflt rFLTLTYnnKHLP'^YGCJLVKfjULOLi'LKPLRKM E'JPMK [ RYPFVi. TAYGTKLQRPHYHL 
I.I,:; 

i.*PnJ)2 2( .>f.2M5U i ) i * 



No robust homolog present in Genebank/EMBL as of 11/7/98
' rTML IGR Y^DDQFTEATKNTPT 1 1 KLGFVRDNLEGLTNP 1 5 EIVSET3SS I KDSVLRSL 
P r LGS I LGCARLYrrTLrrrNDPLDETQEK IWTfT I FGALETLGLG IL I LLFKI I FVILHC I F 
HLVIGFCK 

CPn_0224 263402 263674 

No robust homolog present in Genebank/EMBL as of 11/7/98
YTFKNPKKNKKMKPNS 1 I FLENTKHYPD I FREGFVRDRHGLMEASDWLLSTEITI IRS IL 

CPn_0225 2b3s>4o Jo4^41

No robust homolog present in Genebank/EMBL as of 11/7/98 
NSFTIKFLI^MTKNAINSQTTTPQP^TDAEPIASR^ 

LI S I P IPGLAAQVALCLGIVSLI LGI ALANIGFLCLLLRCKQVPQKPDTLPSESSKQPSE 
GSTPTALPWQAGEFLEKVQVSATPILLPKNKDEELSAKVMKEGAEAASSIKQAVLESTEK 
LIDARKQEESRREARKKIVAEEAEASRKRIQQQMAADQEALRKRKEEVAKRK 

CPn_0226 264545 264967 

No robust homolog present in Genebank/EMBL as of 11/7/98 
AI FNRKFWPYYANTLEFIQGTQSLCPLFKYGFVRHHYKGQLE IEDASHDWDFLEPPSTWK 
RTLLAAIPrLGSVIGLGRLFSIWS IREPQDSQEYKS I FWHTLCAVLEILGLGIVALILKI 
LATF IMAM PGLKR VAT FLFYS 

CPn_0227 265467 265009 

dsbB-Disulfide bond Oxidoreductase

KERFNI FVSCKLLKEIMMINF IRSYALYFAWAI SCAGTL I S IFYSYILNVEPC I LCYYQR 
I C LF PLTVI LG I SAYREDS S I KLY I L PQAVLGLG I S I YQ VFLQ E I PGMQLD I CGRVSCST 
K I F L FS YVT I PMASWAFGA I VC L L VLT KKYRG 

CPn_0228 266242 265412 

dsbG-Disulfide Bond Chaperone

VKDRADFLNLKEKFSCS ILKKENAFEFYVFCS I KQ LTNS S LRG PLNKK I L VLCTAMF F I V 
CFGFLIHKKHTILPPKAHI PTNAKHFPTIGNPYAP INITVFEEPSCSACAEFTTEVFPLL 
KKHYI DTGE I S FTLI PVCF I RGSKPAAQALLC I YHHDPRQAD I DAYMEY FHR I LTY PKEE 
G S HWAT P EVLTKLAEG LK I NSGRSVN PKG L EQC I AS GQ YNEQ I KKNNL YGSQ VLGGQLAT 
PTAWGDYL I EDPTFHE IERA IQH I RQ LQAVEGDHDD 

CPn_0229 266163 267560 

CT178 hypothetical protein 

NSKAFSFIJUEQENFSFKFKKSALSFTYNTANLTKSTCT^ 

LENIYRHFRYRFIjJOjNILPAFLGLLIJLCSPNTIuNYTQVDVIFSDRLCSCL^ 

KRSI^WIjGAPLGIWVTLFACVAGRSPTIFANOTLIGFAILAWC 

E£FSYNPSAGGRRAAVLFLSLI£WLFARYLTASSLGITSSQ 

VLSLAGSERRWHTRPKIVIATALALTGVI ILTLLPI ILHQLRYDCWLCLCLTIEPALAW 
FAYDETRATLRYISQFLGDKRALTPASFFGSEYYKHTLS^ERTVLPLRKAYKQAFEGIS 
FP IJTOLLAILVATVyVKVNSSMGLPTFPRNFLNICCWF I IVLFILAFAESLRHLRWMNLI 
FSAAILFSPVLFH I pvespmflp I IVTGLILI ILSIGKRRRTKRKL 

CPn_0230 263277 267576 

CT179 hypothetical protein

RFKKALIYMSSQPLVTTSSSr^RYWLTGEEKVACYKKAFNH IWHGAPAI ILAAALLMFC 

IFGFVLGS ILLGAPLEGAS ILYDVILPWLLPSILVFVLLVLPLNIYAYSHHKOVLALHER 

ITQSNYKEIYDHCEKEKKTPNKKALSLYIESQVLVPE^SKRF 

ES LKHDEL IQKALERAKENI YM^^KNQR£KRDEREAiCKEAKNAS KTNPLWEGLGT 

CPn_0231 268996 268253 

tauB-ABC Transport ATPase (Nitrate/Fe)

PQAFVS IQDRGF SMLQAHRLCY SC DNQV I LKD AS FQ AS PGT I T 1 1 LGS S GVG KTTLFRLL 
AGFLPLQEGELLV^SPLNPJCDVAYMC^KEALLPWRTALKNMTLSTELGINTSHNALSNE 
RLEEI IHNFDLGQLLDRYPDELSGGQRQR I ALAAQCLSLKPI LLLDEPFSSLDVLLKEQL 
YQDIVAI^KENKTVLLWHDFHr^SCLGDVLWIKNKTLTPVPLDPSMRPLNNGLCFIK 
DLKKHLYT 

CPn_0232 270134 269232 

*similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine
Nucleosidase

KKFLMRRFLFLILSSLPLVAFSADNFTILEEKQSPLSRVS I IFALPGVTPVSFEGNCPI P 
V^SHSKKTLEGQRIYYSGDSFGKYFWSALWPNKVSSAVVACNMILKHRVDLILIIGSCY 
SRSQDSRFGSVLVSKGYINYDADVRPFFERFE I PDI KKS VFATSEVHREAILRGGEEFI S 
THKQEIEELLKTHGYLKSTTKTEHTIJiEGLVATGESFAMSRNYFLSLQKLYPEIHGFDSV 
SGAVSQVCYEYS I PCLGVN I LL PH PLESRSNEDWKH LOS EAS KI YMDTLLKS VLKELC S S 
H 

CPn_0233 270439 270248 

No robust homolog present in Genebank/EMBL as of 11/7/98 
EKARTMFLGKVLLFLLRISRRSYVQE IG I FFHLETPDLKIVLCAFVSTF I WEMDVSLKN 
KGQS 

CPn_0234 271246 270548 

CT181 hypothetical protein 

F IMLOSCKKALLS IWSILAFH P I PGMGVEAKSGFLGKVKGWFSKKEIQEEARI LPVKDS 
LSWKRYDYTSS3GFSVEFPGEPDHSGQIVEVPQSEITIRYDTYVTETH PDNTVYWSVWE 
YP EKVD ISRPELNLQEGFSGMMQALPEGQVLFMQARQIQGHKALEFWIVCEDVYFRGMLI 
SVNHTLYOVFMVYKNKNPQALDKEYEAFSQSFKITKIREPRTIPSSVKKKVSL 

CPn_0235 271395 272177 

kdsB-deoxyoctulonosic Acid Synthetase 

VFVRYLLMKPEESECLCIGVLPARWNC^RYPGKPLAKIHGKSLIQRTYENASQSSLLDKI 

watddqhi idhvtdfggyavmtsptc;ngtertgevarkyfpkaeiivniogdepclns 

EWDALVQKLRi>5PCAELVTPVALTTDPEErLTEKKVKCVFDGEGRALYFSRCPIPFILK 
KATPWLHIGVYAFKREALFRYLOHfrrrPLSDAEDLEOLRFLEUGGKIHVCrVDAKSPSV 
DYPEDIAKVEQY tTCL^NAYF" 

f l-t\JU~',*-> ?12 1 5H / n 7t,t, 

pyrG-CTP Synthetase

^R'^[YHMPFKOrFLTGf;wr::;LGKGLT/•A::LAL[LERORL^IVAMLKLDPYL^A/DPC:TMNP 
KF! fGE ['ATTDOGVCTDLDLGHYHRFr: 'JAAL.".RH.r;:;AT!y% I YARV T KREREGDYLG^TVQ 

v n -M ttne r iqv i ldaakeh:;pdvl r ir/vv :gdi e.^lpflea i rofrydh2EDGlnih 

MTYVPYLOAADEVKrJKPTOMT/f/n.l" J V\ \ r PIjAILGRrJEKPLTOEVKnK C'JLFCNVPNR 
AVi'TA/ [DVKHT [ YEMPI.MLAOEK IAf IT I G EKE ,KL ATVPENLDDWKVLVNQr. :^DLPKVK E 

' ;vvgkyvohrdayk:: i fealthaalp r / ;haai-. e r p r da e o cn ltme lho 1 " dac ' l v p< i fg 
VROWEGK rAAAKFCREQG I PYFG ICLCMQVLWE'/ARNVLNLDQANSLEMDPNTPHPTVY
VMEf^Dr'LVA'rGCTMRLGAYPCLLKPGoKAHKAYNESSLIQERHRHRYEr/NPDYIQSLED 
MGLR [VOTCPPQGLCEI LEVCDHPWMICVQFHPEFVSKLISPHPLFIAFIEAALVYSKDA 
AHV 

CPn_0237 273741 274214

yqgF Family 

nrLRMCAMoKPSSCKAYLGIDYnKKR rGLAYAAEPLLLTLPIGNI EAGKNLKLSAEALHK 

;ru l. ; v>..j / : f s t- ■ r .//r.rp; "VQAEFM 

*K';KTr :'..'[. r i.r^r:."' 

CPn_0238 274210 275836 

zwf-Glucose-6-P Dehydrogenase

PCNHQKLRDFNFRNFLLFVI FASAGTKKE IKMTNWQET IGGLNSPRTC PPC ILVI FGAT 
GDLTARKLLPALYHLTKE^RLSTOFVCVGFARREKSNELFRQEMKQAVIQFSPSELDIKV 
WEDFQQRLFYHRSEFDNNMGYTSLKDSLEDLDKTYGTRGNRLFYLSTPPQYFSRI I ENLN 
KHKLFYKNQDQGKPWSRVI IEKPFGRDLDSAKQLQQCINENLNENSVYH IDHYLGKETVQ 
NILTTRFAOTIFESCWNSQYIDHVQISLSETIGIGSRGNFFEKSGMLRI>IVQNHMMQIJX 
LLTMEPPTTFDADEIRKEK IKI LQRI SPFSEGSS rVRGQYGPGTVQGVSVU3YR£EENVD 
PXiSRVETWALKTVINNPRWLGVPFYLRAGKRLAKKSTDIS 1 1 FKXS PYNLF AAEECSRC 
PI ENDLLI I RIQP DEGVALKFNCKVPGTNN IVR PVKMDFRYDSYFQTTT PEA YERLLCDC 
I IGDRTLFTGGDEVMASWKLFTPVLEEWDQDS SPS F PNYPAGSSGPKEADAL I ERDGRSW 
RPL 

CPn_0239 275863 276672 

devB-Glucose-6-P Dehydrogenase (DevB family)

KS ISMTNIG I ETMATLINFNDTNKLLLTKQPSLFI DLASKEWIASANCAIKQRGAFYVAL 
SGGKTPLEIYKDIVINfCDKLIDPSKIFLFWGDERLAPITSSESNYGQAMSILRDLNIPDE 
Q I FRMETENPDGAKKYQEL I ENKI PDASFDMIMLGLGEDGHTLSLFSNTSALEEENDLW 
FNSVPHLETERMTLTFPCVHKGKHVWYVQGENKKP ILKSVFFSEGREEKLYP IERVGRD 
RSPLFWIISPESYDIADFDNISSIYKMDIL 

CPn_0240 277861 276698

No robust homolog present in Genebank/EMBL as of 11/7/98

LWFWFSPSSESWKANSVVRSNFCYFT.ENKFVSPSESTEVMFSEIMKGRVPDIESLFD 

RPTDMMKTGFKAAQNU^FNSFGrLIMCFSQCKSC 

LG PTLG ALVY CA YKVYT LG KM I Y S I2iKAKAi<VLPJiPAQbAiTHRAAGVAT I RS S EEAVKAC 
KLYKSAM IG S LWS L I ASLAL I ALTAG I VLVLFFVAPGAAPVITAAMMGCCAAGGGALL I 
SLLGLWIAIVRKAKHQEACVGHLTNVVIJiTAVSE^ 

YG HL F SNE EVAQ L VQGG APGGG S RPS QHYGG S S DYQNRRGGNGNFGG SH FGGGGGFAGS H 
FG^GyPTAPTMPSAPPPFPPPAYDT IYG 

CPn_0241 279372 278203

No robust homolog present in Genebank/EMBL as of 11/7/98

I Fil^KFMSAM I S LS SS HEAS I ASNTQVRDVLVSLAMDEFVEHNTE I LP I KVFLARGTLS S 

TAJIpDLKDVVETEGEHHFQVYSNISLKMIYQRFFEKrFGIGCCPLLLVTDSHHTDPCGA 

LI^IFAAVLFTVIAIVFGPTLGrLCYSAYKIYQLTKKISSLSRTHTEVINSVQKSDPFr 

HR^VAAAAASQST IKACKVFRQSTL I FFVLGLI IT I SLAALIVGLVFALFFLDPOAPA 

VMTAAMIGCCAAGGTGIIjLSVIGFLJ_ASWSVQK^ IQMPY 

LPJ3?PGTKKVLTQSIRRYQQFFSDDEYRDIESEVPIJaRCTrpPPSYETLFHEEGSDGSSN 

VI|RESPPAYSTIDSSNSPFPSSSPPPYYR 

CPn_0242 279975 279487

No robust homolog present in Genebank/EMBL as of 11/7/98

KSLKYCSLYQFSOK PTVILMACS I FFRMSQGDYDDEPLSKKTACLWDTMLYPVIAWCA 
WSWLLI LKVLFLLLSFP FKLC SASS AL PG ERVSLGSH FKC LYGGGLPYLLACLL IVPV 
IGTA&HGFIISHRTSEDARLSSAIVFMQAPILQLAGMSGLIKP 

CPn_0243 280609 280133

No robust homolog present in Genebank/EMBL as of 11/7/98
IN^LVFLLKFVKGRI IMACS IGYH LCN ANE PDRFVAS KVAL VAD ILL Y PFMAVI CAW 
FAV/LMVVKLLFLAIKFLVNTCIAACKSRPLPSCKENFCCLFGPKDKPGPSD^^ 
I IfEEil YSTI ITVQSDTNRLRYF I ISPAYQVGSTAI INW 

CPn_0244 280906 281556

adk-Adenylate Kinase

GAPi^TKGSVFI IMGPPGSGKGTQSQYLANRIGLPH I STGDLLRAI IREGTPNGLKAKAY 
LDKGAFVPSDFVWE I LKEKLQSQACS KGC 1 1 DGFPRTLDQAHLLDS FLMDVHSNYTVI FL 
EISEDEILKRVCSRFLCPSCSRIYNTSQGHTECPDCHVPLIRRSDDTPEIIKERLTKYQE 
RTAPVIAYYDSLGKLCRVSSENKEDLVFEDILKCIYK 

CPn_0245 281627 282499 

ydhO-Polysaccharide Hydrolase-Invasin Repeat Family

TCOKEIMKHYLSFSPSADFFSKQGAIETQVLFGER^/LVKGSTCYAYSQLFHNELLWKPYP 

CHSFRSTLVPCTPEFHIHPNVSWSVDAFLDPWGIPLPFGTLLHVNSONTVIFPKDILNH 

MNT I WGSGT PQC DPRHLRRLNYNFFAELL I KDADLLLNFPYVWGGRSVH ESLEKPGVDCS 

GFINILYOAQGYNVPRNAADQYADCHWISSFENLPSGGLIFLYPKEEKRISHVMLKQDSS 

TLIHASGGGKKVEYFILEQDGKFLDSTYLFFRNNQRGRAFFGIPRKRKAFL 

CPn_0246 282955 282551

rs9-S9 Ribosomal Protein 

WAKSTI0ESVATGRRK0AVSSVRLRPGSGKIDVNGK3FEDYFPLEIQRTTILSPLKKIT 
EDQSOYDLI IRVSGGG IQGQVI ATRLGLARALLKENEENRQDLKSCGFLTRDPRKKERKK 
YGHKKARK3F0FSKR 

CPn_0247 283430 282969

rl13-L13 Ribosomal Protein

D:;Y I rMEKRKDTKTTIVKSSETTKSWYVV'DAAGKTLCPLSGEVAKILRGKHKVTYTPKVA 
MGDf^V I V r NAEKVRLTGAKKGQK I YRYYTGY 1 5GMPE I PFENMMARKPNY 1 1 EHAI KGMM 
H<M'RLf;KKULK::LRIVKGDSYCTFESOKFrLLDI 

CPn_0248 .?K445 i JSltjbO

y<:tV/yN>A ABC Transporter •YIT.ibf- 

!<:;H« ;YLF::RVRYA'l'r JP:)FP r j PAt'KK 'RKNArLPT irK"PL<AM, r jLL [ EAKNL'JKT IQQQN 
fjMI::i f ;n>V: ;[.::[ J bV;i'T[:J rT^AiAJNtlKTTt.LHLLOTLCVP^r^^LRFFDKDLKNQDLA 
NKI<rJO!IICI-V]-VjNFYI,[i:DD1VLKm/[>irAL[ARKIirJKOaPVA'TRALELLOLVNLEDKV 
irn<(;::KI.::o;KK(JRV'\ f ARAL tNCPA t LLADF.PijGNLDEETFSEQ THNLLLCQAjALCG IL 

r vti iNKf ii .A; '.\ jf :: ;i<n ;vi ;m iklfh in.' 



LEVMKFEFSVALKYL I PCRGRLYSAIVS L FT/G 1 1 3 LWWLo [ VF 1 3V I HGLEQRW I EDL 
S0LHSPrTILPSDTYY3^YYQtDKHS3L^rrTTKTLGEKIASPQVDPYDPESDYLLPET 
FPLKDCDLCCCXJKDPVKKTLESLGPYLQSQHGKV r EFEQGVGYLD I KTSLKLQKPQPRNL 
THFLTYPSKLSYEDKVLPYDETDYTSAELNPFNRSPSGWQQDFHHLEELYRGASI I LPST 
YKDSGYKVGDTGVFSTYS I ENEKETOYTVH 1 / IGFYN PGLSPLGGRTVF IDPDLARS I RSQ 
S EG LGM SNG FHL F F PNTKR IVFVKKQ I EN ILTSLGVDDYWEISSLHDYDYFQPILDQLQS 
DQVLFLFVC I L I LI VACSN IVTMSMLLVNNKKKE IG I LKAMGTSSRSLK 1 1 FACCGAFSG 
ACGWIGTIFAIITLKNL^FIVKALfTfUXIRETFNTAFFGQNLPNSVH 
' r \ v"-\r r"? v ~ t ,'\W"~r" ■"~ 

rl33-L33 Ribosomal Protein 

KDSSMASKNREI IKLKSSESSDrmnV^KRKTTGRLELKKYDRKLRRHVIFXEAR 

CPn_0251 286036 287559

*conserved hypothetical protein

S PDSCLPWMSPFKK IVNRLLCY I S FQ KES RTL P 1 1 1 R E P RMTTKSLGSFNSV I SKNK I H F 
I S LGCSRNLVDSEVMLG I LLKAGYESTNEI EDADYL I LNTCA FLKS ARDEAKDYL0HL I D 
VKKENAKIIVTGCMTSNHKDELKPWMSHIHYLL£SG^ 

EMGEVPRQLST PKHYAYLKVAEGCRKRCAFC 1 1 PS I KG KLR S KPLOQ ILKEFRI LVNKSV 
KEIILIAQDLGDYGKDLSTDRSSOLESIXHELLKEPGDYWIJ^MLYLYPDEVSDGIIDLMQ 
SNPKIiPYVDIPLOHINDRIU<C«RRTTSRE0II^FLFJCI J PJVK^ 

TQEEFQEIADFIGEGWI^LGIFLYSQEAbTTPAAELPDQIPEKVKESRLKILSQIQKRNV 
DKHNQKL IGEKI EAVI C*IYH PETNLLLTARFYGQAP EVDPC 1 1 VNEAKLVSH FG E1RCF I E 
ITGTAGYDLVGRWKKSQNQALLKTSKA 

CPn_0252 288112 287576 

CT144 hypothetical protein (frame-shift with 0253?) 

ATSTVCALWILC/TYQSHDDAASCSFRRACRFGRYWLGGV^ 

YIDSSQTWMMRFQASASIPRLFRISIF>rrKHGDWIDNGTGGELI^VAYE 

IEI^AMSTCSGTSYYRARPMCWL^STYYAVT^PGYFVLE^ 

CPn_0253 288474 287950 

CT144 hypothetical protein (frame-shift with 0253?)
FCGGRLMSSSIPTTQKITISIPTFVRFNIESI^TDEQ^ 
VDAIX^LICQ^roL t SVG<^INITPC/^F^r^MVFSGRV^^SNSPFSYQDSL^^ 
EQPC^YVPYGYYKLTRVMMMCRAAI^GGHVGSGDIGWGESMYLGI SS IKRQHKVQ 

CPn_0254 289268 288459

CT143 hypothetical protein 

IPMKTLGVKDQNLFIDQATLSVERNVRIENNLETrRDIJCVL^^ 
QI^ATTI^DGFNIYSKTDVSQTPVCNNlSDPQSARDALTFSYyRKTGCOAAN^ 
GYYVAPNTT I ETHVAA ITSKSVSRNAT PD FSRYAD I EP WKLKQVG I YO VTMQLTRWSGQ 
HEGDNS ATL I LNFVSGNNKT LL CT SDTRGGY S S DRT SVAVT A I F SVT EL VS S P PYD YPW I 
bn-ESTIWMNL^LSTCVIWFPFPSNFVEVD 

CPn_0255 290183 289329 

CT142 hypothetical protein 

TIXKVIMKNNINNNECYFKLOSTVIXjDLLAANIJ^ isstetfsvqgnatfkdq 
VS ATX^LT SGTTYNLNAQNFTS S Q I S I DFKNNRLSNC AL P KEDCD PVP ANYVR S P EY FFC S 
KPLIGDFDFNSGESYLPLTGSEYTLYQSRNVNSIFRFIGWKQSTRELTVGGNTAIOFLAA 
GTY TVS FTVGKRV^?hJW3^KXjA I Y IN1WLGQVQC EST I YSGGGYAT IGTLGTS I YRASVD 
VAPNPNDPNASDRYRAG IFYLSNGGS SAG IGNY S FSLLYY PDDRG 

CPn_0256 291282 290398 

CT144 hypothetical protein 
FCGGPJJ4SNPTPKTKISIPTFVRFmQSI^TEDQKKTT 

DGGLTCQS DLT IQKDI N I R PTSTNSMVFDGRLNLSNS PLSYKNSQGQD I TDYEKMS SGKP 
QEYVPFGYYKRTQIMMAQRAAHSSGYVGGGSVPSGSYVPWNKFDQTSTQKTSGTEIYIDP 
NDSTKLVF EVNNKVPKLFR I SV IMAKHG SWLDNGTG AD I LLAANEYEOGGGR INVTDLAM 
TT SRGS S YYETRP LQWCVTYYAQNNGY FT FQNRAGGG LRVS F FSWN I VALPYVE 

CPn_0257 292136 291267 

CT143 hypothetical protein

GVVMKRRNLOKrLPNASTPSTNVAENTG I KDQNLF LDQ AT LNVDGNVD I ENF LETRDLKV 
ADTITSPCEFTVGGGLSAESSQFKATTLSKGLEITSEDQIXjRVPKFTNVSDPQSPRDALT 
YNYYR>m^QAI^YTYYSSSQPTTVGKPI ETVCQNPNPETYRI S ASAK IYDAVTRFPY I 
QFKAPG IYQVT IQ IRRESGQHSGLDNPNLYLNLMIGNNKTLLCASDTRGYSGGHRTS I AV 
TGT FTLTE I VAT P PHDYPWLFLETT I GLD IKSMSTC VTWF P FQ ANFAEVD 

CPn_0258 292534 292133 

CT142 hypothetical protein (frame-shift with 0259?)
C F S FCRLGS KF EK I TLGGNTAI QLLAAGTY I LT FT I GKRWGWNNGWGGS I RL FEG KYTGD 
GTMLCGSTVYSGGG YST IGYLSTAVYRDH SD I DPD PNNPSDKYMNN FL FVRNG DHS AV IG 
NYS FTLLY FAGDKV 

CPn_0259 293031 292441 

CT142 hypothetical protein (frame-shift with 0259?) 

I YFVFKRKTYNYF I EMTTTNNODNNECYFKLDSTVDGDLLASN IQT FDKQAKG I SSTETF 

SVQGNATFKEKVSATGLTS ASTYKLNATG PAPSS IT I DMKNNRLSN PALPKNPCDPVPAN 

YVRSPQYFFCAKPIEGTFMFDGSSRYLPITGDGSNYTLYQSSKAGDVFRFVDWDQNSKKL 

HLCGTQPYNFLLQEPIS 

CPn_0260 294090 293548 

secA-Protein Translocase Subunit

AYLDFSKRSCVEEDHVSKKtNRfJDLCPCGSNKKYKQCCLKKEEQTARYTTEGKFKFSAEV 
LSASEQGEACDNCTKLFQRLSQSLTSEQKAAVGKFHQrTKMKEVMSKKALKKAQAKEEKL 
^EKI^HNFEII^KTGENLAPPMESTATLNODTNFVCEDFIPTQEDFRISENSOKPPVEE 
D 

CPn_0261 2'M27.1 2')5 0U

yrJ,iO-pC"Lnop "upH-r t .im t ly ATPa<5^ 

YriFNHPFIVFWSTLLLNPPWMKAfJKRIErJLVRKALYTffTMLANfiKKIWAI^OGKDSLTL 
LLMLKAICGRGFf'DLDLMAVNrO r JKYf:'"GAEVNKPYLTI' [CDOLC I PFRT t PHPYAPETP 
E("YPC.'jQAPRRLLFOA(\KE ICA^A [AF V JI IHRDULVOTALLiNLLf IKAI-'.FACIMLPVI.DMVIIF 
r IVT I LRPLI FTPEFWI RKFAKKrjf IFAPVrCROI'WIH.HwKAEO^t .Kr*t.LuEVFPL/\iU(N t A 

LAryF.Mcrjr.K^OKt 

I. r FNINKEVKVYLVLMNKRLK [ I LTNtAJ ', [TAKf IM^^t.V'IAI.LFAN KID IY tAAPOARO:; 
< ;K; ;MA I : XN0VVCASPYAYPQPVKEAWAVGG3 PTDCVRLGLRTLFE3V3PDLVISG INCG
rJN [CKNAWYGtJT IGAAKQALVDG t P3MALSQDNH I S FFQQDKAPEILKALVI YLLSQPFP 
C LTGLN I NFPT3 PGC3SWEGMRLVP PGDEFFYEEPQYLGSVNKNQYYVGKISCVR IGEHP 
nEELACMLENHISVGPrFSONSPIGLMTLEEFOKTQENFNASLLSSELTTKIF 

CPn_0263 296174 297136

yqtU hypothetical protein 

3TAL3RR KLRVR PPSLAKYAFPGFRMSHG PRPTKFSFPLYFSKTLSWFILGGFLAACGVQ 

:r;:. ,■: ';;■:.[ i . ; /fi;;/;.,,,!' '^r n ;:'L\. *, i r Yr [""vrTMLT*" 

- ;.v: , i r.* > i>: w f / ;m '; • • rvrr g " m ? tvv: , ; , \ : r tvgv' - 1 . i r D f f v , ' vdctte i u: 

I [ INKKKGYTVGQIILFVNFFIFALSGIVYKNWHTAFVSFLTYGIATKVMI>IVILGLEDT 
KSVTI ITSS PRKLGHI LMETLG IGLTY IHAEGGYSGEPRNLL YVWERLQLSQLKE IVHR 
EDPSAF I AI ENLHEVINGRRT 

CPn_0264 297730 297155 

ubiD-Phenylacrylate Decarboxylase

MKRYWG ISGASGVI LAVKLIKELVNAKHQVEVI I S PSGRKTLYYELGCQSFDALFSEEN 
LEYIHTHSIQAIESSLASGSCPVEAT III PCSMTTVAAI S IGLADNLLRRVAEVALKERR 
PL I LVPRETPLHTIHLENLLKLSKSGATI FPPMPMWYFKPQSVEDLENALVGKILAYLNI 
PSDLTKQWSNPE 

CPn_0265 298632 297730 

ubiA-Benzoate Octaprenyltransferase

KI I IVRIJ^FI^LVNFKYSIFSILFLSASTVFALSINEISQNLSFKEGFKISVFGAIAFV 

FARTTGIVWCCIDRFIDKKOTRTSKRVLPANLVSLNFAWVLSLFCSFLFLFLCKILRIF 

SLGIASLTLMIVYPYMKRVTFFCHVGI^LVYTVAILMNFCAFAES^ 

SVGI^IAANDIIYAIEDTEFDREEGLRSVPAHYGH^KAVEIAKVNLWVSYLAYIFSGFVG 

SLDKEFYFTAI I PLWILKWRMYSNYSKKDQEGESKFFIJWIAIALSFLVSMTLFWSLS 

R 

CPn_0266 299181 299876 

No robust homolog present in Genebank/EMBL as of 11/7/98
IMALDEI^QNNPS(X!IASSTSGTSKI^ 

LG LSVP LSG I LGT FAVTVG AVL F ITG LT X LVR K5 LG I EQ KNEDLNFLK I KT PT PPAR PLM 
SKFSVTCSTTS I VLGMALL IGAWSVPFLTGYL^LGIX^GLVGLGTALFVAGLARMSPRS 
LATOEGSGSADSQSNI VG IGEPKAAQEQKWYKMAVVRGEDG I PTAI RLT PEK 

CPn_0267 300122 300910 

No robust homolog present in Genebank/EMBL as of 11/7/98 
V5.J^SLJ^KTNALLNQ PE PAVCLNAWDPKYINQDRKT FACTVTLLVIATLMI LTTGVIVLL 
A^SPGLSVLVSTIIGTSVTTLGTALFIIGLVKLIKKSLAWIQYQKYFQEVVKQKYEPFS 
IP'pnaiVHKLTSCLPSPLDIESPSPEASTFVSKIJlIA^^ 

TGY^QLALC VGFACLGTALFVGGLAGLRTHSL I AQG IMYLYLTYYLSSALEERNETVKDQ 
RNE3NTYLTEECKQQKREKALLE 

CPn_0268 300914 301318

No robust homolog present in Genebank/EMBL as of 11/7/98
KQjWA^SLMSQCQSSSTSTWEWMKSFVPNW^ 
. NAlQSNP PGT STPNVENG IDDIJ^PIiLGQPNEQ^ANNPGTSGSNPTSLPAPERLPETEENS 
QEjPEQGSQNNEDLIG 

CPn_0269 302468 301476

Dipeptidase 

VA$*feVMT IDMHCDLLSHPHFCRKDPAVRCSPEQLLSGGVRQQVCAI FVPHSRGEPNCDK 
QNSLFFSLPNCYPD IGLLSYEEEENGSSSQKKSLSLIRS IENASALGDDTAPLGTLLAKL 
I HIT KQGP LAYLG I VWKG DNRFGGGT EAP KRL SNDGKVLLD I MYELGVP I DL S HC S DKLA 
EDiiliDYTADKLPNLAV I AS HSNFRSVLDHRRNLVDAHAKE I VRRKGVI G LNLVRSYVGDS 
LGjpLEKHVLHAENLGI LSS IVLGSDFFYANEDENFF FNECSSAEAHPVLNQLIHRIFSKG 
KAE^CLSSRAEKFLKQVIVEQVNPKITDVKL 

CPn_0270 303343 302468

ywlC-SuA5 Superfamily-related Protein 

S I F^VI VPD KKAQ I T FSLP EVMSAI HQG K IVALPTDTVYGFVLS LYAS EAEERLYALKDR 
EPSKAFALYVNS I EDI EN I SGYPLS PTAKKLAOLFPGAITLWKHRNPRFPKETLAFRI V 
DH^^EXVDHCGTLIGTSANLSEFPSALTAQEIFADFADHDLCIFDGPCSHGLESTWA 
SDPIbYI YREGL I SRSVI ENIAGTEAXIFHRTSHAFSKHIKI YTVKNQEQLVS FLSGSLDF 
KG^US. EHPKPKN FYTRLREALKKKTP S I VF I YD I NT S DYPELF PFLS PYY I E 

CPn_0271 303628 304362

Lysophospholipase esterase

KLMTDYSFFRRKIGNI EAIECPGNPQDPI I ILCHGYGSLADNLTFFPS ICSFSKLRPTWI 
FPNG I L PLENDFRGSRACFPUvlVLLLQELSRLYANGVGNLQEKYDELFDVDLETPKEALE 
ELILNLNRPYNEI I IGGFSQGAILATHLVLTSQNPYAGALIFAGARLFNQGWEEGLKQCA 
G/VP F LQ SHG Y EDE I LPYH LGAH LNDLLLT KLNGQ FVS FHGGH EI PS WFQKMQVTVPNW I 
DPARG 

CPn_0272 305272 304340 

dnaX-DNA Pol III Gamma and Tau 

FNRQS DATYATWVMHL EEENCGWEALLRKVYHQEVP PAI LLHGFTL PVLQDKAEQLASE I 
LLSSSPG3EHKVSQKIHPDIYQFFPEGKGRLHSIDLPRGIKKQIYISPFEANYKIYIIHE 
ADRMTLAAISAFLKVFEEPPKHAVIILTTAKVQRLPKTI ISRSLSIFIERGEKILCSKET 
FSYLFRYAQCEIPVTEVSQIIKESSETDKQVLRDKVQRFMEVLLELYRDRYTLNLGLKAS 
ALNYPEHVKEILQLPLLPLDKVLLIVESACRSLNNSSSAASVLEWVAIQLVSLQYKEKEL 
VSVSPGQDLSN 

CPn_0273 305853 305227 

tdk-Thymidylate Kinase

G3 rVFTVIEGGEGGGKSSLAKALGDQLVAQDRKVLLTREPGGCLIGERLRDLILEPPHLE 

lsrccelflflggraohiqevripalrdgyrvrcerfhdctiv/qgiaeglgadfvadlc 
rikwgptpflpnfvllldipadrglqrkhrokvfdkfekkplcyhnriregflslasadp 
::rylvldare:ila:;lidkvmlhtqlglct 

CPn_0274 30H36B 305852

gyrA-DNA Gyrase Subunit A

l-^TIPMFNKDEr IVPKNLEEEMKESYLPY5MSVI L3PALPDI RDGLKPSQRRVLYAMKQL 
JAKHRKCAK r r -GDT. l .L;DYHPH(.lE3VI YPTL\/Pt^AQNWAMRYPL\/DGQ''3NFGS IDGD 
! '1 'AAMR YTEARLTI l.'JAMYLMEDLDKDTVD I VPNYDETKHEPWFPSKFPNLLCNGSSG I A 
VGMATN I PPHNUJIiLIEATLLLLAWPOASVDEILQVMPGPDFPTGG I ICGCEGIRSAYTT 
GRGK IKVPAULHVEENEDKHRE5I I ITEMPYNVNtCJFL I EQ [ANLVNEKTLAGISDVRDE 
: IbKrX ; r P'/VLE I KKGF^E [ I t NRLYKrrDVQVTFGAMMLALDKNLPRTMS I HRMI^AWI 
RHKKEV f PRr<TRYELNKAETRAHVLEGYLKAL:5C[<DALVKT IPE3GNKEHAKERIIESFG 



I i 

FTEPQALAILELRLYCLTGLEAEK IQKEYnCLLNKIAYYKQVLSDEGLVFD: rRNELQDL 
LKHHKVARRTTIEFDADDtRDI EDt ITNEJVI [TI3GDDYVKRMPVKVFKEQRRGGHGVT 
GFDMKKCACFLKAVY3AFTKDYLL I FTNFGQCYWLKVWQLPEGERRAKGKP I INFLEG I R 
PGEELAAILNI KNFDNAGFLFLATKRCWKKVSLDAFSNPRKKC I RALEIDEGDELIAAC 
H IVSDEEKVMLFTHLG^VRFPHEKVRPMGRTARGVRGVSLKNEEDKWSCQ IVTENQSV 
LIVCDCGFGKRSLVEDFPETT'JRGGVGVRS ILINERNGNVLGA I PVTDHDS I LLMSSQGQA 
[RINMQDVHVMGRSTQGVR L"«/HLKEGDALVSMEKLSSNENDDEVL3GSEEECSDTVSLR 

r ' r l N , ■ i •■'..* '• ^ 

FMDPKEKNYuAJA IT/LtGLwAVRERKJMY XGDTG ITGLriHLVi'EVVLti3 1 DEAMAo iCS 

RIDVRILEDGGIVIVDNGRGI P IE^ERESAKQGRE^'SALEVVLTVLHAGGKFDKDSYKV 

SGGLHGVGVSCVNAI^EKLVATVFKDKKCYOMEFSRG I PVTPLOYVSVSDRQGTEI VFYP 

DPKIFSTCTFDRSILMKRIJIEIJVFT^RGITIVFECDRDVSFDKVTFF^ 

QNKESLFSEP IYICGTRVGDDGEI EFEAAUJWNSGYSELVYSYANNI PTRQGGTHLTGFS 

TALTRVINTYIKAKNIJ^KNNKI^AL^ 

SVAQQWG HALT I FFEENPQ I ARM I VDKVFVAAQAR EAAKKARELTLRKSALDSARLPGK 
LI DCLEKDPEKCEMYI^/EGDSAGGSAKQGRDRRFQA ILP IRGKILNVEKARLQK I FONQE 
IGTIIAAIGCGIGADNFNLSKLRYRillllMTDADVDGSHIRTIXLTFFYRHMTALIENEC 
VYIAQPPLY^CVSKKKDFRYII^£KEMDSYLLMI/^^^ESS^LFKSTERELRGE^ESFI^^ 
ILDVESF INTLEKKAI PFSEFLEMYKEGIGYPLYYIJVPATGMQGGRYLYSDEEKEEALAQ 
EETHKFKI I ELYKVAVFVD IQNQLKEYGLDISSYLI PQKNE IVIGNEDS PSCNYSCYTLE 
EV INYLKNLGRKG I E I QR YKGLGEMNADQLWDTTMN PEQ RTL I HVS LKDAVEADH I FTML 
MGEEVPPRREF I ESHALS IRINNLD I 

CPn_0276 311140 310793 

CT191 hypothetical protein 

DM F LKRKKRGG S QVQNKRT AS P I KHAKHYLHNYLQELOKIMAARPHDAI DAWNQVFRDKY 
KGMSQAIGFRDH ILLVKVYNSSLYALLKQTPQNDLIMSLYQVASHVQ IREIQFLLG 

CPn_0277 312003 311404 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NISIFYPKYFIEGKEVLIKNLPPLIFYGVII^IIINVRAPAFGITSVQOFSTNFQAAIPIL 
NI VIGCSRI SSTYAEDIEEVAQEKLEKSTHSKSSTSVNLWAHRVRGWE ILGGG IVI LAL 
EITALVLQVI IKLIKCLIDVLCVCLFGLGVCWAI IGAIAFCVWWKYLGFCSQGEELE 
P IEVKTLISPDKPYPTWYV 

CPn_0278 312884 312060 

*conserved outer membrane lipoprotein

RDSMKKKLSLLVGLI FVLSSCHKEDAQNKI RIVAS PTPHAELLESLQEEAKDLGIKLKI L 
PVDDYR I PNRLL LDKQVDANYFQHQ AFLDD EC ERYDCKG EL W I AKVH L E PC; A I Y S KKH S 
SLERLKSOKKLT I AI PVDRTNAQRALHLLEECGL I VCKG PANLNMTAKDVCGKENRS INI 
LEVSAPIiVGSLPDVDAAVIPGWAIAANLSPKKDSLCLEDLSVSKYTNLVVIRSEDVGS 
PKMIKLQKLFQSPSVQHFFDTKYHGNILTMTQDNG 

CPn_0279 313546 312875 

* Possible ABC Transporter Permease Protein 

KKIMQSDLIQ ILLKETVNTLYMVSTAFFFSCAIGGMLGLGLFCTS PKSLNPKKSLYAT IS 
MI LSFLTAI PFAIL IVILFP ITRWIVGTSLGPTAS IVPLTIGAIPFWTIWDAFRNSAL 
NYLESAVALG I P KRN I LFG ILL PESYPQL I FSLKSLWHL I SCSTLAGFVGGGGLGQLLL 
QYGYYRF EWSVTT SVL V I TL VL I ESVR I LG DFWG RR VL KY RG IL 

CPn_0280 314593 313550 

dppF-Dipeptide Transporter ATPase 

IKGEAWLVSEQHSP 1 1 SVQDVSKKLGDHILLSKVSFSVYPGEVFG IVGHSGSGKTTLLRC 
LDFLDMPTSGS I SVAGFDNSLPTQKFSRRNFSKKVAYISQNYGLFSSKTVFENIAYPLRI 
HHSEMSKSEVEECVYDTLNFLNLYHRHDAYPGNLSGGOKQKVAIARAIVCQPEWLCDEI 
TSALDPKSTENI IERLLQLNQERG ITLVLVSHEIDWKK ICSHVLVMHQGAVEELGTTEE 
LFI^SENSIT^LFHEDINIAALSSCYFAEDREEVLRLNFSKELAIQGI ISKVIQTGLVS 
INILSGNINLFRKSPMGFLI IVLEGEVEQRKKAKELLI ELGWIKEFY 

CPn_0281 315033 316103 

"dhnA- Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin 
family) " 

ISLRPiiTLi^I^IHDILGNDDENLLSYCGKHITKDKLTLPSHDFVDKVFGLSDRNNRVLRS 
LQTMFSHGRLANSGYLS I LPVDQG IEHSAGASFAINP IYFDPENIVKLA I ESGCS AVAST 
YGTLSLLSRKYAHKIPFMLKLNHNELLSYPTKYHQ I FFTQVEAAY SMGA VAVGATVY FGS 
ETSNEEIVAVSNAFAKARSLGUVTVLWCYLR^PAJ^ANGVDYHTAADLTGQADHLGATLG 
ADIVKQKLPTCQGGFKAINFGKTDERVYSELSSNHPIDLCRYOVLNSYCGKVGLINSGGP 
SG KNDFTEAARTAVIN KRAGGMGL I LGRKAFQR PLS EG I QLLNLVQ DI YLDPN IT I A 

CPn_0282 316084 317529 

xasA/gadC-Amino Acid Transporter

ILILOSU^FSKKVFMHSHSKPTKPLGTFTVGMLSLAWIGLRNLPLTAKHGLSTLFFYGL 
AVICFMI PYALI SAELASFKPQGIYIWARDAI^KWWGFFAIWMOWFHNMTWYPAVLAFIA 
ST I VYK I N P ELAHNKVY I ATV I LAG FW I LTF FN FLG I T S 3 ALFS S I CV I IGTL I PGV I LV 
SLALFW I FSGNP IA I SLSWGNLLPNFSNVSSLVLLAGMLLALCGLEANANLASDMVNPRK 
NYPKAVFIGAIATLTIL^GSLSIAIVIPKEEISLVSGL'/KTFTLFFDKYNLSWMTGIW 
VMT I AG SLGELNAWMFAGTKGLF I STQNDCLPRLFKKVI 13 KNVPTNLMLFQG I WT I FTL 
LFLCLDSADLVYWI LTALSVQMYLAMY ICLFLAGPILRI KEPRAQRLYSVPGKFLG ICTM 
SILGILSCAFALWVSFLPPRELAQISEGSKIGYTTFLLLAFSLNCLIPFGIYFTHKRLSK 
KS 

CPn_0283 318581 317532

No robust homolog present in Genebank/EMBL as of 11/7/98
GRRL3YFQDLIKNAVAKIISFRKSPPNPVKLLIKFAKKDLKNSS IAPLYEVLLEILEAPG 
EE I LEVLFSLDPMWLKSMLDPKKHSTLG I E I SSETAET I ESCSLGL t S INLLLSGLCLRS 
SHDRGOAVKIIQQFCPOFSSEEVONFVEORNILTPFLHHLFECDEVALLNQLfJLRLDLIV 
PNALYPEPDPSCW0:'INSEDCAKDAEDQ0EDFNKTKEACKEGLKKLVLPAL3ITSIPQLL 
RARRFKQGAEILMA[PRKKMKQNPr I PLEALLCJ CCP^ [ .T/GKYLKLtiMT IHLWDKLLMA 
rYUJYFOTGLICOGRIETF^RRANIjNPEAFQAArQQGr-'LL^FLFPKMLLP 

CPn_0284 U'10'j4 JIH">5 1

No robust homolog present in Genebank/EMBL as of 11/7/98
rL[^rirPAPOvpvriMM^TsiNT:;r;YGU';LK:::;Lpp[TYLiLAtr^tATLMSvi,YrrGr e:; 

V( :TFVI/;MLIPL,:;vr':VLrVAYLrY0fjr;3 rEKTKVFniT'T'IIVFI'^nnniJJLLLi'REKD:; 
V::A[DCLLKNFPA[)PKRRrKMtJ'Y:;NFLDE(Jf;pPNR«IPKED:iHT':KI I. 

CPn_0285 t.Mi.ui ii'KiM

No robust homolog present in Genebank/EMBL as of 11/7/98
KKLKNLFFFT^\NKFtTV:fIia.IYI'KfK J ';F';i.:;p'/TIUJJ.Ar^VLI.l < U:VVt-AfM^-f(VI. 
AAPLGLLVWGCAASVC3MMA IVSLMCLYKGGKPLIEPSNEEIUDPTfCDLEIKDPESLKPV
PVEGQSLPK ERKTVSFKAK I PS £VEDDFKPWIQSTFYHQNKVY3KPIAERMQSLEKEIT 
TLIVDFPRALEEGSK3SGSLLRCVI3EIKNLFLPRFLGRKVKY3LTACLPRLGSIVEEYA 
SS DLL I LLLTK P E PLNMVTQQ L I AHLNS LKTEKRKLT PHMQK LVLS I NFWFYGWSLEEKC 
r EK IVAYDPNLLTDELKAHLEAGN rVQFLLSFQSSEMQREFRALFPSDAQELPSAKD3SN 
WPAINSSEYMYDFKDLSVLKKSLSERLAFCEKIPSPSSWNFTSSVASHYKDFSLLFTFF 
SNQQSVI LQNPF LL I ELLH ENPKCQT FLKGLLEKAM PMSNWAALFR PMLMGMLC SG I ARK 
KELKI I AEHLGVPFKEITQA IASGKILDLLLQHLFDF 

mgtE-Mg++ Transporter (CBS Domain)

SCRESKGKIMVGEQNRNEEKLDTAFSSGNLMDSRTSHLDDELSFKLEKAFTCLSTDIHSH 
DLSKIVI EYNPI DLAYAVSCLPSESRAILYKNLSC ITAKVAF I INTDSASRWAI FRRLSD 
SEVCAL IEQMPPDEAVWVLDDI PDRRYRR ILELIDSKKALKI RDLQKHGRNTAGRLtfTNE 
FFAFLMETTVKDVSAC I RS NPG I DLT RLVFVL D FKG ELOGWTDRS L I INPPEMSLKQIM 
NQ I EHKVLPDATREEWDLVERYK I AALPWDEENFL IGAI TYEDWEA I ED I ADETT AR 
MAGTTEDVGYQTCHWQRFtXRAPWLLVTLFAGLISASVMAYFQKI SPALLALI IFFIPL 
INGMSGNVGVQCSTILVRSMATGTLSFGRRRET IFKEMS IGLLTGWLG ILCGLWYLMG 
FLGLNI FSGGG IQLGVTVATGVXiGAS LTATTLGVLS PFFFAKLGVD PALASG P I VT ALND 
IMSMI IFFLI AGGINFLFFN 

CPn_0287 324230 322089 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RRCMIRSPLPFISSKRALNMLGIX2DEFSCPEDVVDFLFSEI^ 

LLMMTHNHPKWKRVI FYGVSYGLKHKSMS I F IDVLTYIDFLFEKLG ISASDRLSLCSAR 
TC INFELYS QTG EMKFLSEWDNFRL I EQLLKMH PQLKNRLGWEHFR IGAKQ EEVSLVAS 
AS VYQAVGRS F I ELYH KH L ELS BLACGMKC LALALDLS PNNAH I HADYAKG LWLGT RQG 
K S LL I ERGM EH F S KAI F LS FS RDGDTLAY QNY RYSY ALASVKL F DLTYKKEH FDQAMNT L 
YQWQAFPNLSGLWMVWGELLIRSGWLtf SNMKYI EVGLEKLASLQKKTNDP I ALSGLLAT 
GIAI LGLYL EE PNLFKDS RHRL IS AMRTF PGNSALVHALGWQLC SALYFNEDS HFASAI 
SCFQSCLEWDLDATGMWQKLFDAYFSWGIKKKSAi^LP^ 

RG LALKCLAEAT IDEAYKEIFLSESLLHYQRAWDLSGRLEILELWGQSHYLLAELQQSLF 
HYDEAYTLLTECVDLTLSS SRVKLI LAAVLLGKGRLI^DTDPAEEARE I LEPLVEVYLEDE 
NF LLLLGKVYL F LFWKNKNVCLGKLARTYL EKAT S LGC P EAYYTLGKFYAVI KDVNKAW3 
MVIRSAQYGVRITEAKWLNDPYLANLREIHAFREVVE^KGRLWD3 

CPn_0288 325785 324571 

CT288 hypothetical protein 

IS IT IREFLFFGFECRAKFYNVIMSCFNLTSTNESLRPI SPKASFPKQGWQSYFRSALRK 
HBSDTLSVSVCKVNKYDANLFVRJLTVIAIAWG^ 

I&&PTILLTGGMY ILHRLGKKVDVI SGVC IPPFSRRCWFI SSSHTLEKFDEKHVSACSY 
LDISTLSADGSGIAAVYQCPPLLFRAFPCFGI PCAMPFVALLRMIYNLIRFLWPFYI IF 
RMlYEHFFCKHLPEDDRFIYKDVARE^RSIAAFLKAPFYASACMIGAFYSLI^ 
LMSSVE RDWNDNV I LARSVSLANEAH S LFRF EGGGG RKG LGQHAFY LMLC C Q PQSVFLF D 
KGEIVSGAHPSIQLPERRGLDTSGRYPHISVIPDSGNDSAKNFIV 

CPn_0289 325797 326996

C||^9 hypothetical protein 

NF^NRLMKKQRSHYKKNNLLLLLS ILVGI^LGSVQSPWIVYSAECI ANTFLKFLRLLS IPL 
V^U^STITSIQNFmWTLGKRILYYTIXTTVIAAS IGLLLFFLLRPQMITQDALATT 
Tlk^PLGYLDVLSDTLPENIFKPFLQ^3NVISAACLAVL^ 
F&JjFLNIARGGLKIiPIAMLGFSVILFKELKD^ 
IJJ^INKVSPLKVAKAMSPALVTAFFSKSSAATXPLTMEIJ^^ 

vMmNGCAAFILITVLFVATSNGMI I SPLMSLGWIFI ATLAAIGNAGVPMGCYFLTLSLL 
TSMNVPLS I LGLILPFYTVI DMIETSLNVWSDCCWSLAN 

CPn_0290 327027 329523

-dependent Transporter 
RSA&TMNKKHASFSSRLGFIFSMIG IAVGAGNIWRFPRVAAQNGGGAFL ILWLCFLFLWS 
I PLlIIELS IGKLTKKAPIGALIKTAGKKFAWAGGFITLVTTCILAYYSTIVGWGLSYFY 
YAMSGK IHLGNDFAKLWTSHYOSS I PLWAHLTSLGLAYLVIRKG I VHGI EKCNKIL I PAF 
FLCTIALLLRAVTLFGAVQG IKQLFSCDKSCFSNYKVWI EALTQNAWDTGAGWGLLLVYA 
GF^SKKTGWSNGALTAICNNLVSLIMGI 1 1 FSTCASLD ILGTTQLQDGAGASS IGITF I 
YL PELFTRL PGGIYLTTLF SSI F FLAFSMAALS SM I SMLFLLSQTLAEFG IKPYI SETLA 
T f ^AFVLG I PSALSLTF FSNQDTVWGVAL IVNGL I F I YAALVYG FPKLKKEVINAAPGDL 
RLNKAFDYI IKYLLPI EGI LLLGWYFYEGLFPENGQWWNP I SLYSLGSLVLQWSLGLI IL 
WK&JKQLYLRFSRYNHEIL 

CPn_0291 328658 329194 

incB-Inclusion Membrane Protein B 

EKHMSAPI PTPQELSDQ ITCLNVQYQQVS ELARENKGD I EGLKTLTAALTADAGIQPSAD 
EIYSLQTAAALI LSASEKPGSGPSGSTEGSVTVQSPCKFKKVLAWLT I IALIAIAVLIA 
C I IAACGGFPLLLSALNLYTIGACVSLP I IASTSVALICLCTFVANSLIKPVITVRTTR 

CPn_0292 329201 329636 

incC-Inclusion Membrane Protein C

VKNTKNSDFMTS P I PFQSSGDASFLAEQPQQLPSTS ESQLVTQLLTMMKHTQALSETVLQ 
QQ RDRL PTA S 1 1 LQVGGAPTGGAGAP FQ PG PADDHH H P I P P P WPAQ I ETE I TT I R S ELQ 
U^RSTLQQSTKGARTGVLVVTAILMTISLLAI III ILAVLGFTGVLPQVALLMQGETNLI 
WAMVSGS I ICFIALIGTLGLILTNKNTPLPAS 

CPn_0293 329940 332723 

CT234 hypothetical protein 

VWSMQRVLRLLFNLHHGEEKRAFLFFLLGLVWGIGCYGTLSLAEGLFIEKLGSAELPKIY 
LCSSLILCVLSSLILYNLFKKHISATALFLIPVSLSILCNFYLILSSIFAIDPPRSPLFF 
YR-I V I WS LT I LS YTS FWGF VDQ FFNLQDG KRH FC I F MA 1 1 FLG DA I G SG 1 1 ASLVHT I G I 
QG I L I L FT AALV LT F P I VF YVS KS LK S LSD DH DLF I DTG H P P PLS KALK LC F YDKYTFY L 
LCFYFLMQLLAIATEFNYLKIFEIQFASKEEFELVAH IGKCSLWISLGNMCFALFAYSRI 
VK R LGVNN 1 1 L F A P LC F LS LF L FWT F KTT LS I AVLAMWR EGVTY ALDDNNLQL L I YGV P 
NK IRNQ ER I WESF I EPIGMLWSL ICFL3GQQYVFCL t ISLIVT I LVCLVR3YYAKAIL 
KNLSAQAL0LTR3M0DWIKSMTVKQKR0VELFLLAHLKHPSERH0TFAFQHLLNLASRSV 
LPSLLAHMNKL3LPNKLKT IEMVKSGLWAKDFLTLELLKRWTS [ FPHPA IA3AIHLYFAE 
MDL.LH ETH [AEDLYDTVGDRLLAAILTVRRCfEAYGPYRDLADKRLKELLNSDQPEDtVMG 
LT[LKLEKNP0NFPrLLDFLNTKNEDILIVTrKALHTr;VRANIiKrYCrELLKRLRQC3HN 
DKA:J0YL,LKTT:UALDr:;FVKDLLMTT::OLKNTJRKYA[;-:AM[i.:KLDKEVAPAFL0VLTDE 
r;THNRCR EL/\AKALCK IDNIWLLKKUAYKIVKiJKAlJKALFYnYMC.HY IQKKYPTYNL3LLA 
NT[ ^J^NYYAKVNFML^LL.G ILGSMEHSGVL IRALT3KNQK IKAOALESLEiaiCDSHLFSL 
Lb!PPyN0H;MCYSEKYYFKCGVIPLTLKELLNMM[^SP:;:JLNK[:rA00LKEEL3YCDPDF 
Q3WrrrYNQKHEDFRTEE3ETLr^KLS E 

CPn_0294 i ) (077 IMS02
cAMP-Dependent Protein Kinase Regulatory Subunit
LRNFFMNL I DRAFLLKKT 1 1 FQSLDMDLLLT I ADKTET I I FK PGSNV^S ICQ PGFS FY 1 1 
VEGYIT rSKEKLE3PLNLKPLDCFGEESLFNNKPREYNA3/\NTQVRMLVLSKGQ ILNIVE 
ECPSVAL3FLELYAKQIKFREP . 

CPn_0295 333866 333627 

acpP-Acyl Carrier Protein 

AMSLEDDVIAI IVEQLGVDPKEVNENSSF I EDLNADSLDLTELIMTLEEKFAFEISEEDA 



CPn_0296 jj4/JJ JJl^J

CT296 hypothetical protein 

KIPIRGMICMDITLVGKKVIVTGGSRGIGI^IVKLFLENGADVEIWGLKEERGQAVIESL 
TGLOTEVSFARVDVSH>KXrWDCVQKFLDKHN^^ 

I STNLTSLYYTC S SVI RHM I KARSGS 1 1 NVAS I VAK I GS AGQTNYAAAKAG 1 1 AFTKS LA 
KEVAARNIRVNCLAPGFI ETDMT SVLNDN LKA EWLK S I P LG RAGT P EDVARVALF LASQ L 
SSYMTAQTLWDGGLTY 

CPn_0297 335724 334774

fabD-Malonyl Acyl Carrier Transacylase

S RSNKDDNFMKKRYAF LFPGQGSQYVGMGQDLYMEY PEVREL FDF ANER LGF SLTS I MF E 
GPEDLIJ4ETVHSGLAIYIJ1SMAWKVI£QRS^^ 

LELVRKRGQU^EACNQSPGAMAALIX3LPSEVIEEKITSLGCGIWIA2miAPKQLVVAGI 
AEKVDQAIELFRDLGCKKAVFiKVSGAFHTPLMCVAQDGI^APDIYALCMKDSSLPLVSHV 
VGKSLVNTEEMRECLARQMTSPTLWYQSCYHIESEVDEFLEI^PGKVLAGLNRSIGISKP 
ITSLGTFAQIEKFLSEV 

CPn_0298 336742 335717 

fabH-Oxoacyl Carrier Protein Synthase III 
YTSFFLYMWFSVNKrnCKAAIWATGSYLPEKVLSNADLEKMVOTSDEWIVTRTC 
GPQEYTSLMGAIAAEKAIANAGLSKDQIDC 1 1 FSTAAPDYI FPSSGALAQAHLG I EDVPT 
FDCQAACTG YLYG LSVAKAYVE SGTYNHVLL I AADKL S S FVD YTDRNTCVL FG DGGAACV 
IGESRPGSLEINPiSLGADGKLCEIiSLPAGGSRCPASKETLOSGKHFIAMEGKEVFKHA 
VRRMETAAKHSIALAGIQEEDIDWFVPHQA^RIIDAIAKR^ 
ASSVGIALDELVHTES IKLDDYLLLVAFGGGLSWGAWLKQV 

CPn_0299 336726 337415 

recR-Recombination Protein 

RKKLVYYS ES LYSNLNLGPRPECKNK I H I TMTRYPDYLS KL I FFLRKLPG IG FKTAEKLA 
FELISWDSEQLKILGNAFHNVASERSHCPLCFTLKESKEADCHFCREERDNOSLCIVASP 
KDVFFLERSKVFKGRYHVLGSLLSP ITGKH I ENERLS ILKSR I ETLC PKEI I LAIDATLE 
GDATALFLKQELQHFSVN I SRLAIiGL P IGLS FDYVDSGTLARAFSGRHSY 

CPn_0300 337768 340152 

yaeT-Omp85 Analog 

GPvLLGMLIMiy^IKVIU5ISII^ICTPLTLFSTEKVKHX3HV^^ 

PKIJCTRSGALFSQLDFDEDIJIILAKEYDSVEPKV^FSEGKTNIALHLIAKPSIRNIHISG 
NQVVPEHKIUCTIX3IYRNDLFEREKF3^GLDDLRTYYLKRGYFASSVDYSLEHNQEKGHI 
DVLIKINEGPCGKIKQLTFSGISRSEKSDIQEFIQTKQHSTTTSWTGAGLYHPDIVEQD 
SLAITNYLHN13G YADAIVNSHYDLDDKGN I LLYMD I DRG SRYTLGHVHI QG F EVLPKRL I 
EKQSQVGPNDLYCPDK IWDGAHKI KQTYAKYGYINTNVDVLF I PHATRP I YDVTYEVSEG 
SPYKVGLI K ITC^TTHTKSDVILHETSLF PGDT FNKLKLEDT EQRLR 
SQLDPMGNADQYRDIFVEVKETTTGNLGLFLGFSSLDNLFGGIE^ 

KGFRCLRGGGEHLFLKANFGDKVTDYTLKWTKPHFLNTPWILG I ELDKS INRALSKDYAV 
OT YGGNVSTTY ILNEHLKYGLFYRGSCTS LH EKRKF LLG PN I DS NKG FVSAAGVNLNYDS 
VDSPRTPTTGIRGGVTFEVSGLGGTYHFTKLSLNSS I YRKLTRKG I LKI KGEAQFIKPYS 
NTTAEGVPVSERFFLGGETTVRGYKSFI IGPKYSATEPQGGLSSLL r SEEFQYPLIRQPN 
ISAFVFLDSGFVGLQEYKI SLKDLRS SAG FGLRFDVMNNVPVMLG FGWP FRPTETLNGEK 
IDVSQRFFFALGGMF 

CPn_0301 340163 340762 

(OmpH-Like Outer Membrane Protein)

IKDLSKEIFWFRKGFWYPFS I PKLVQVT MKKLLFS TFLLVLG STSAAH ANLGYVNLKRC 
LEESDLGKKETEELEAMKQOFVKNAEKI EEELTS IYNKLQDEDYMESLSDSASEELRKKF 
EDLSGEYNAYQSQYYQS INQSNVKRIQKL IQEVKIAAESVRSKEKLEAILNEEAVLAIAP 
GTDKTTE I IAI LNESFKKQN 

CPn_0302 340766 341866 

lpxD-UDP Glucosamine N-Acyltransferase

SKFKEFSMSEAPVYTLKQLAELLQVEVQGNIETPISGVEDISQAQPHHIAFLDNEKYSSF 
LKNTKAGAI I LSRSQAMQHAHLKKNFL ITNES PSLTFQKC I ELF I EPVTSGFPGIHPTAV 
IHPTAR IEKNVT I EPYWI SQHAH IGSDTY IGAGSV IGAHSVLGANCL I HPKWIRERVL 
MGNRVWQPGAVLGSCGFGY ITNAFGHHKPLKHLGYVI VGDDVE IGANTT IDRGRFKNTV 
IHEGTK IDNQVQ VAHHVE I GK H S 1 1 VAQAG I AG STK IG EHV I IGGQTG ITGHI SIADHVI 
MIAQTGVTKSITSPGIYGGAPARPYQETHRLIAKIRNLPKTEERLSKLEKQVRDLSTPSL 
AEIPSEI 

CPn_0303 342982 341^21 

CT303 hypothetical protein 

REQKGLHHMDVSRKINRHTQFYVDSIDGVIKNFDHKPSEDKSRDHEELEEKLLTITKRIV 
ASAQEFQNRKTD3KNYYLKKTQWLPFKNEELEQTKELFAMLTSMDKKIAQLFFYSPGCSS 
DWVEFT EV ICHLNDS IGLGGVLLCCGLFEQQC EHVVTVNKKLDLPLLLGTTWNSLRYYL 
TYRNISLLNCQSMSELGKELGDVLKQHGVAFTLIFKEIVDIDLIJslYVKLIQGLKRSGNIQ 
ARIYDNDVPTLPSVSSSPIALRYSLANTIRGLALHVDFSSLKFISPSILSOTEHTAKALN 
3GGECF I FSNLDEFNLGMK I VMQLLRTGK ISPEI LNKN IMK ILMI KRRVRSLY I 

CPn_0304 M1091 344L54

pdhA/odpA-Pyruvate Dehydrogenase Alpha

DQKPLPKRLPYKKVMDSSAPYN I AiiO/rrEK'TTVER I LDLYGPASC t KFLKQMVLIREFEA 
RG E FAY LEG L V< *G FY N I Y ACQ E AV AT AA [ ANT<* ; LDPWVF 3 S Y PC H A LA [ L LN I P LQE I AA 
ELL/IK ET^rAUlRGGSMHMrGPNFPf ;r,F<J IVf 7VJ I PLAAGAArV [KYQF.QKNRV3LCFIG 
ir,AVA(J ;VFHrTLNFV3LH0LPLMLr UXM JW.^MnTSt.NRAVAKOP I AE3Q(i3SYDIRAV 
TVNt IT* Df J* N' '[,[ aTFRFAYRYMVDTFSI'VI .VVX l,( '^RFRCUr* £ SDFNLVRSKF.KMOCLFKK 

rji'Evr^KrjwLrRLFVt/rrFFFOMrRoncKTAVt.nAFSNAKLrj^np^v'rTr.Fr.r^WA 

CPn_ f ) iV> \AA 1 A2. 14 M : / 

pdhB/odpB-Pyruvate Dehydrogenase Beta

l<KL3Mt'Kf IKTI ,1'. [KCALRFA I f jEEM.'JPDI'NV' 1 I / \r.EV< IDYf l f '.A\ KVTKf lUAMWlPKRV 
£DAP[. f JKAAF:X; IC rriAAl./AJLI'l't [ F:i-M: !Wt Jl -^'VALI^J l'[ SIIAAKMlil'MTi X;KF3VP E 
VFRf' PNfiAAAOViVUWJtii'Vl TAX AN rrtir.r I I AI";rH'VDAKM,i.K::AI RNNNf'VLFLEN 
KLIIYNt.K^EVrrEliYI.VPEv^KAilPVOryiNin/I'r ITY'IIWirTKI-'AfM.AKKPWfJLUICI 






[DLRT [K PLUrSTILS^VRKTSRC I VI EEGHY FAG I SSEI IALITEHVFDSLDAPPLRVC 
QKETPMPV:;K [LEQATLPNfVNRILDTIEKVMR 

CPn_0306 34511b 346431

pdhC-Dihydrolipoamide Acetyltransferase
OKFVISLLKMPKLSPTMEVCTTIVKWHKKSNDCjVSFXIIDVIV^ 

ErLRHEGEKIVrCTPrAVLSTEANEPFNLEELLPKTEPSNLEASPKGSSEEVSPATTPQA 
ASATFTAVTFKP EP PLSS PLVFKHVGTTNNL3PLARQLAKEKNI DVSS IQGSGPGGR IVK 

km .;-::-\-\ i ■ ; r : ■ : a- ;r ; v ; 1 * i ; vp t r v; 'T f ECN r / " r ~v r aap r a * * • p»*"V7pccvv 
A.:r!,!.:!r,[,i'r:u , 'V:r!':, 'itn- rvpA'*A[.Aryn r-*!Mr/;r N-,vr,f ^:*/»rrc[or.'iA 

VA I PDG 1 1 T P 1 1 RCADRKNLGMI SAEIKS LALKAJVJQSLQDTEYKGGo FCVSNLGMTG IT 
EFTAI VNPPQAAILAVGSVTEQALVLDGE ITIGSTCNLTLSVDHRV IDGYPAAMFMKRLQ 
KILEAPAVLLLN 

CPn_0307 348998 346515 

glgP -Glycogen Phosphorylase 

NGC I VE D F S S FDKNKVSVDSMKRA I LDRL YLS WQ £ PES AS PRD I FTA VAKTVMEWLAKG 
WL KTQNGYY KNDVKRVYYL SMEFLLG RS L KSNLLNLG I L DLVRKAL KTLNYDF DHLVEME 
SDAGLGNGG LG RLAAC YLD SMATLA V P AYGY G I RYDYG I FDQR IVNGYQ EEAPDEWLRYG 
NPWE ICRGEYLYFVRFYGRVIHYTDS RGKQVADLVDTQEVLAMAYD I P I PGYGNDTVNSL 
RLWQAQS PRGFEFSYFNHGNYIQAIEDIALI ENI SRVLYPNDS ITEGQELRLKQEYFLVS 
ATIQDIIRRYTKTHIC LDNLAD KWVQLNDTH PALG I AEMMH I LVDREEL PWDKAWEMTT 
VIF^^r^^^riTILPEAI,ERWPLDLFSKLLPRHLEIIYEINSRWLEKVGSRYPKNDDKRRSLS 
■IVEEGYQKRINMANLAVVGSAKVNGVSSFHSQL'IKDTLFK^^ 

RWIALCNPRLSKLLNETIGDRYIIDLSHLSLIRSFAEDSGFRDHWKGVKLKNKQDLTSRI 
YNEVGE I VD PNS LFDC H IKR I H EYKRQI^INILRVI YVY^LKENPNQEA^/PTTVI FSGKA 
APGYVMAKL 1 1 KL INSVADWNQDSR VNDKLKVL FL PNYRVSMAEH 1 1 PGTDLS EQ I ST A 
GMEA SGTGNMK FALNGALT IGTMDGANI EMAEH IGKENMFIFGLLEEQ IVQLRREYCPQT 
ICDKNPKIRQVLDLLEC^FF^SNDKDLFKPIVHRIX,HEGDPFFVIJ^LESYIAAHEN\^K 
LFKEPDSWT K I S I YNTAGMGF FS SDRAI QDYARDIWHVPTKSCSGEGN 

CPn_0308 349213 349596 

No robust homolog present in Genebank/EMBL as of 11/7/98

F FTQENNMATVAQT PQTTQ PQ P SVS K KATHRYCSWVFF K P I LVSLGLLLAS LTTLGLV I A 
SGVTLSLGI GIVLAIQ IVLAG I ALVLAFNH I RQFKQARTAELNSMKM IS APAAATVQKQK 

LEDRYSSK 

CPn_0309 350977 349595 

CT309 hypothetical protein 

FMRAWE E FLL LQEKE I GTNTVDKWLR SLKVLC FDACNLYLEAQDS FQ I TWFEEH I RHKVK 
SG^^^NNNKPIRVHVTSVDKAAPFYKEKQMC^FJCT 

DllEfRVTX3EFTKSPDENGGVTFNPrYLFGPEGSGKTHI^lQSAISVLRESGGKILYVSSDL 

FT EHLVSA I RSGEMQKFRS FYRNI DALF I EDI EVFSGKSATQEEFFHTFNSLHSEGKL XV 

VS^YAPVDLVAVEDRLISRFEWGVAIPIHPLVQEGIJ^FI24RQVERLSIRIQETALDFL 

I YffiSS NVKTLLHALNIiAKRVMYKKLSHQL^ IR 

NVSCJYYGVSQESILGRSQSRETYVLPRQVA 

Ll|EgKI EENSHDIHMAIQDI SKNLNSLHKSLEFFPSEEMI I 

CPn_0310 353472 351049

60IM-60kDa Inner Membrane Protein

YF|BEiL SL I FRVYQMNKRTLLFVSL IG I AFVGC Q I FFGYNEFRSCKNLAEKQRKI SEQTLA 
AV^^GLSVASWOTDVTTCEEHKN^AVRVGDKLFIiHNGEAAQSVYS SGESWSFVDHKCG 
FD^HLALYRCCGSSFNPTfJTGKWLPTNHEGLPV^ 

KDSpp FGT AL VFWRSG S DY I PLGLYD S REEKL VSLDLP I T RAVI FGNDQ D SAKS S DTANH 
YVLFNDYMQ I IVSEESGS IEGINLPFASTNNKS I VNEIGFDRDLASEKS PEALFPGLSSK 
LPDGQQAKNSIGGYYPLLRRGLLSDSKKLLPLEYHAIi^SGREI^TPVAlJlYRVLSYTP 
HSlQLESLDRSVQKVYKL PENPEEKPYVF ETA ITLTKETEDVWVTSGVP EVE IMSNAS AP 
TLiBQgRVI KKNKGSLDKVKL PKVKEPLA I RRGVYPQW I LNSNGYFG 1 1 LT PLS E I ASGYGS 
LYJSGSTAPTRLSAI S PKNQLYPVSKYPGYETLLPLPKDAGTHRFLVYAGPLAEPTLKVL 
DKtir^TQEKGENPEYLDSISFRGVFAF ITAPFAALLF I IMKFFKLVTGSWG I S I ILLTVFL 
KLLLYPLNAWS I RSMRRMQ ILSPYIQQ IC^KYKNEPKi^QMEIMGLYKTNKVNPITGCLP 
LLpjLPFLI AMFDLLKSSFLLRGASFI PGWIDNLTAPDVLFSWQTS IWF IGNEFHLLPI L 
I^IVMFLG^KVTSLHKKGPVTDC^KCX&VMGNMMAILFTAMFY^ 
W^VITNKILDSKHLKNEVVIJ^NKICHR 

CPn_0311 354453 353575

CT33rl hypothetical protein 

emr^Mmaviywdp^kivwsfepwslrltwgvfftvgiflaclsarylalsyyglkdhls 
fsksqlrvalenffiysilfivpgarlayvifygwsfylqhpeeiiqiwhgglsshggvl 
gfllwaaifswiykkkiskltflfltdlcgsvfgiaaffirlgnfwnqeivgtptslpwg 

WFSDPMQGVQGVPVH PVQLYEG I SYLWSG I LYF LSYK RYLHLGKGYVTS I AC I SVAF I 
RFFAEYVKSHQGKVLAEIXLLTIGQ I LS I PLFLFGVALL I ICSLKARRHRSH I 

CPn_0312 354518 354976 

CT101 hypothetical protein 

CTMARNIKYFLILFPG ILWISAGMKLLLKATA IALDPLSSFFTYCLLSMVSWGLASLKHR 
YLLSKTIRKQLSLSSEFFSQKITWIAYIKQTFISRRFLIMVIMIAFSLVLRRYISNPQAL 
FV I RATVG YAL I KT A I AY FS KLQN ALMEN PEGN 

CPn_0313 354957 355355 

acpS-Acyl-carrier Protein Synthase

WK ILKEISANSME I IH IGTDI IEISR I REAI ATHGNRLLNR I FTEAEQK YCLEKTDPI PS 
FAGR FAGKEAVAKALGTG IGSWAWK D I EVFKVSHG PEVLLPSHVYAXIG I S KV ILS I SH 
CKEYATATA I ALA 

CPn_0314 356285 355353 

trxB-Thioredoxin Reductase

M I HSRL 1 1 IGSG PSGYTAA I YASRALLHPLLFECFFSG I SGGQLMTTTEVENFPGFPEG I 
LGPKLMNNMKEQAVRFGTKTLAODI ISVDFSVRPFILKSKEETYSCDACI IATGASAKRL 
E I PCACINDEFWQKCVTACAVCDGAS P I FKNKDLYV IGOCDS ALEEAL YLTRYGSHVYWH 
R R 0 KLR A: i K AM EARAQ EsINEKITF LWN S E E VK I SOD?, I V R ' }VD I KNVQTQ E I TTR EAAGV F 
FA ICE 1 K PNTDFLf X3QLTLDESGY r VTEKGTSKTSVPOVF AAGDVODKYYRQAVTSAGSGC 
E AALDAFRFLG 

''t-n_D J ] '» l r >i ; 077 iSH7tti 

rs1-S1 Ribosomal Protein

Ml'KQAF.YTW< ^ 'KK 1 1 .DNI ECLTEDVAEFKDLLYTAHP IT^'JEEEl'PNFIOPGA ELKGTW 
|itNK[Jl-'WVW;LK::E( IVIf'MSEFIDSnFflLVUJAEVrVYLDOAFDFEoKWLrjREKATR 
<jR<jWKY E LAIK :i\FS 1 1 VKC'QTTRKVKGC^L [ VDIi IMEAFLPr JHQ [ DNKK E KNLDDYVGKVr 
KFK I [jK FNVKRRN [ WSRHELLEAER L'.KKAEL [ EQ HZ Tf JEYRKtlWKN rTDFGVFLDLD 
' '• I nOLI *' I ITDMTWKR f RIIP'JEMVEr JMOELEV I I L JVDKEKCIRVAUl C.kOKEHNPWED £ EK 
KY PPGKRVLGK I VKLL PYGAF I E I EEC I EG L I H IS EMSWVKN [ VDPS EWNKGDEVEAI V 
LS IQKDEGK ISLGLKQTERNPWDN I EEKYP tGLHVNAE I KNLTNYGAFVELE PG I EGLIH 
1 1 DMSW I KKVS HP3ELFK KGNSVEAV r LSVDK ESKK rTLGVXQLSSNPWNEI EAMFPAGT 
V I SGWTK I TAFGAFVELQNG I EGL I HVS ELS DKPFAK LED I IS ICENVSAKVIKLDPDH 
KKVSLSVKE*fLADNAYDQDSRTELDFKDSQGPKERKKKGK 

CPn_0316 359784 360121

nusA-N Utilization Protein A 

■r~~x- ■ •-Mrr^r;*. .**■ r " r,t MrK:r? r?prr r T~\ r ^r\;y rAA^y™: p*"DA:r:~ ( 'v 

rn -wr.;p?"Vi "FK-r'/F r '..tip :k- :r',; .'\reydi ^^uwmvr'rf-,DNFC.R:\\ t i 

AARUII^KLRHAERLVIYLL: KiiKVNLTLJGVVKKf- AKJSNLI IDLCKVEAILPTRF^ P 
KT EKH K IGDK I YALLY EVQ £S ENGGAEV I LS R SHAEFVKQLF IOEVP ELEEGSVEI VK I A 
R EAGYRTKLAVR SSDPKTD PVGAFVGMRGSRVKN 1 1 RELNDEKI D I VNY SPVSTELLONL 
LYPIEIQKIAILEXiDKVIAIWNDADYAWIGKRGINARLISHILDYEILEVQRMSEYNKL 
LE IQRLOLAEFDS PHLDQPLEMEG I S KLV I QNLEHAGYDT I RRVLLAS ANDLASVPG I S L 
ELAYKILEQVSKYGESKVDEKPEIED 

CPn_0317 360045 362750 

infB-Initiation Factor-2

SLLIRSLSKSA2^EKVXLTKNLKIJ<IKNAQLTKAAGLX»KLKQKIA 

AKEKSVKVAIJVATSTPTASAEQASPESTSRRIRAKNRSSFSSSEEESSAHrPVDTSEPAP 
VS IADPEPELEWDEVCDESPEVHPVAEVLPEQPVLPETPPQEKELEPKPVKPAEPKSVV' 
M I KSKFG PTGKH INHLLAKTFKAPAKEEKVVAGSKSTKPVASDKTGKPGTS EGGEQNNRE 
KQFNPANRS PASGPKRDAGKKNLTDFRDRSKKSDESLKAFTGRDRYGLNEGG EEDRWRKK 
RVYK PKKHYDEAS IQR PTH IK I SL P I TVKDLAAEMKLKAS EVIQKLF I HGMTYWNDI LD 
SETAVQFIGLEFGCTIDIDYSEQDKLCLSNDTVRI3EIQSTDPSKLVIRSPIVAFMGHVDH 
GKTTL IDS LRKSNVAAT EAGA I TQHMGAFC C ST PVG D I T I LDT PG H EAFS AM RARG AEVC 
DIVVLWAGDEGIKECTIoEAIEHAKAADIAIWAINKCDKPNFNSETIYRQLSEINLLPE 
AWGGSTVTVNTSAKTGEGLSEXLEMLAI^AEVL 
TVLIQNGSLKLGEALVFNIXrYGKVKTMHNEHNEIJ^EA^ 

VVKNEKTARD 1 1 EARSAGC^PxFALCXJKKRP^DSMLQhHCKTLKLM I KADVQGSI EALVS S 
ISKIKSEKVDVEILTNSVGEISESDIPJ^AAASKA\0,IGF 

FTVIYHAIDAIKEIMTSLLDPIAEEKDEGSAEIKEI FRSSQVGS rYGCIVTEGIMTRNHK 
VRVLRNKE I LWKGTLS SLKR VKEDVKEVRKG L ECG I LL EGY OQ AQ I GDVLQC Y EV I YH PQ 
KL 

CPn_0318 362704 363126 

rbfA-Ribosome Binding Factor A 
VMSYNVMKX.SIIHKNYNIiCYCMTENRRIK^ 

R VS L SKDLH S ARVYVS VMPH ENTKE EALEALKVS AG F IAHRASKNWLKYFPELHFYLDD 
IFS PQDYIENLLWQIQEKEKS 

CPn_0319 363133 363879 

truB-tRNA Pseudouridine Synthase 

TI FFGNLNT I KDMTMDLAVELKEG ILLVDKPQGRTS FS L I RALTKL IGVKK IGHAGTLDP 
FATGVMVML IGRKFTRI^D I LLFEI3KEYEAIAHLGTTTDSYDCDGKVVGRSKK I PSLEEV 
LSAAEYFC^EICX2LPPMFSAKKVCOKKLY£YARiCGLSIERHHSTVQVHl^ITK^ 
FWSCSKGTY IRS I AH ELGTMLGCGAYLEOLRRLRSGRFS IDEC I DGNLLDH PDFD I S P Y 
LRDAHGNSL 

CPn_0320 363824 364783 

ribF-FAD Synthase 

TTPISIFLPTYEMPMEIAYSLTSSFSVDSVTVGFFDGCHLGHSNLLSILTSYSGSSGVIT 
FDS H PQTVLSLNHTKL I NT KEE RLQLLQT F P I CWLG VLT F DLNFANQS AEEF LT LLH RNL 
KCKRLILGYDSCIGKEQQSOTEALDTIGKPLGIEVIKIPPYRMDNIWSSKAIRQFLSAG 
NL ECAHRFLGH PYAI SGK ITEGSG IGGSLGFAT INL PRE ES L I PLGVYACE I RYDSTTCQ 
GVMKI/STAPTFGRESLYAEAHIFSFAE^YGKEVSIIPRKFLREEKKFQSKETLIRArEK 
DI LDAQDWFAKGSFNYEGTA 

CPn_0321 365900 364767 

ychF-GTP Binding Protein 

YSKKHVI IF IFRCLMSHTECG I VGLPNVGKSGLFNALTGAQVASCNYPFCTI DPhTVGIVP 
VIDERLEAIJ^ISNSQKIIYADMKFVDIAGLVKGASDGAGLGNRFLSHIRETHAIAHVVR 
C FDD PDVTHVSGKVNPVED I EV INLEL I FSDFSS AKNI H SKLEKLAKGKREVGALL PLFD 
T I IAHLEKGLPLRTLELTPEQIVALKPYPFLTMKPMFY I ANVDES SLPDMDNDYVAAVRE 
VAAKENSKWPICVRIEEEIVSLPIEERLEFLMSLGI^KSGLHRLVRAAYDTLGLISYFT 
TG PQESRAWTWRGSS AWEAAGEIHTDIQKGFIRAEVITFEDMIECQGRAAARELGKLH I 
EGRDYIVQDGDTMLFLHN 

CPn_0322 366231 367328 

yscU-Yops Translocation Protein U

SNLGNSMGEKTEKAT PKRLRDARKXGQVAKSQDFPSAVT F I VSMFTAFSLST F FF KH LGG 
FLVSMLSQAPTRHDPVITLFYLKNCLMLILTASLPLLGAVAWGVIVGFLIVGPTFSTEV 
FKPDIKKFNPIENIKQKFKIKTLIELIKSILKIFGAALILYITLKSKVSLIIETAGVSPI 
ITAQI FKEI FYKAVTS IG I FFL IVAILDLVYQRHNFAKELKMEKFEVKQEFKDTEGNPEI 
KGRRRQIAQEIAYEDSSSQVKHASTWSNPICDIAVAIGYMPEKYKAPWI IAMGINLRAKR 
I LDEAEKYG I PIMR^A/PLAHQLLDEGKELKFIPESTYEAIGEILLYITSLNAQNP^^NKNT 
NQPDHL 

CPn_0323 367322 369460 

lcrD-Low Calcium Response D
' SFIE^KLLNFVSRTLGGITrALM^INKSSDLILALWMMGVVIjMIIIPLPPPIVDLMITINL 
SISVFLLMVALYIPSALQLSVFPSLLLITTMFRLGINI3SSRQILLKAYAGHVIOAFGDF 
WGGNYWGF I IFLI ITI I QF I WTKGAERVAEVAAR FR LDAMPGKQMA I DA DLRAGM I D 
ATQARDKRAQ IQKESELYGAMDGAMKF I KGDV I AG I VI S L IN I VGGLT IGVAMHGMDLAQ 
AAHVYTLLS IGDGLVSO I PSLL I ALT AG IVTTRVSSDKNTNLGKEISTQLVKEPRALLLA 
GAATLGVG F F KG F PLWS F S I LAL I FVALGILLLTKKSAAGKKGGGSGASTTVGAAGDGAA 
TVGDNPDDY5LTLPV ILELGKDLSKL IQHKTKSGQS FVDDM I PKMROALYQD IG IRYPG I 
HVRTDS PSLEGYDYMILLNEV P y/RG K I P P H HV LTN EVEDNL S R YNL P F ITY KNAAGL PS 
AWV^EDAKAILEKAArKYWTPLETyilLHLSYFFEfKSSQEFLGIQEVRSMIEFMERSFPDL 

V KEVT RLIPLQKLTEIFKR LVO EQ I S [ K D L ,RT T L Eli LS EWAQT EK DTVL LTEWR S SLKL 

Y I : J FK F SQGOS A ISVt'LLDPEIE EM I RGA I KOT; JAi j S Y LAL D P DS VN L I LK S M RNT I T PT 
FA( XWPPVLLTA IDVRRYVRKL £ ETEFPD I AV I 'JYQF. ILPE I R IQPLCR IQ IF 

rpn_0 \7A UOt/riH 
CT324 hypothetical protein

VWAf i PR HMAASGCTGCLGGT(y "/tILAAV FAAAAKADAA EWASOEOi; EMNM I QO^ODLT 
N^^^AAT^^'^KKKEEKFOTLESRKKGEAGKALy K^'f ::>TFCKr j r/rD[ADKYA:*GNr;EIi'GQEL 
Kl ILI'UA I ( IDOA;] PED r LALVQF.K I KOE'AL'j. :'[■ ALDY LV^'IT PPSO^KLK EAL IQARtMTHT 
LX'F^RTAU.AKN ILF-\:;0l- YADOLriVr:Pr:r :i R A. ^.FVr^ETTIITCDOLL^MLODRYTYQD 
MArV^'^-'tWKCMATELKROiIPr/P'Wjt-ijVIW'I'frrwrir/jAVtiTilYDYFEilRVPtLLDCLK 
AEG I <"JT T l' I J I jN FVKVA EG Y H K I INDKrPTAGKVEREVRNLICDDVDSVTGVLNLFFSALR 
QTSGRLF^CAOKRQOI/^AMrANALDAWtNNEDYPKASDFPKPYPWG 

CPn_0325 370696 371148 

CT325 hypothetical protein

KRIAMQNQYE0LL£3LAPLUm , LAPDKNNSCLIRFGDTHVPVQIEEDGNSGDLAVSTLL 
GTLPENVFRERIFKAAL3VNC.jFQSS ikgii^ygevtqqlylsdilsmnylngeklfeyl 

KLFSLHAKIWMESLRTGNLPDLHVLGIYYVA 
malQ-Glucanotransferase

PSCFGNLLRRVNVLKYTKHS PSAHAWKL IGTS PKHG I YL PLFSIHTKNSCGI GEFLDL I P 
L I SWCQKOGFSVIQLLPLNDTGEDTS PYNS I SSVALNPLFLSLSSLPNI DTI PEVAKKLQ 
DMHELCSTPSVSYTQVKEKKWAFLREYYQKCCKSSLEGNSNFSEFLESERYWLYPYGTFR 
AI KHHMHGEP INTWPK3LTIX3EKFPDLTKKFHDEVLFFSYLQFLCYQQLCEVKAYADQHH 
VLI^GDLPILISKDSCDVWYFRDYFSSSRSVGAPPDLYNSEGQNWHLPIYNFSQLAKDDY 
IWV^ERIJIYAQNFYSVYRLDHIIGFFRLWIWDSSGRGRPIPDNPKDYIKQGTEILSTMLG 
ASSMLPIGEDLGI IPQDVKTTLTHLGICGTRI PRWERNWESDSAF I PLKDYNPLSVTTLS 
THDSDTFAQWWLNSPKEAKQFAKFIiiLPFQKTLTTETQIDILKI^HESASIFHINLFNDY 
LALCPDLVSKNLQRERINTPGT ISKKNWSYRVRPSLEELAIHKKFNGYI EKILTGL 

CPn_0327 372927 373211 

rl28-L28 Ribosomal Protein

RI HRKNMSRKCPLTGKRPRRGYSYTLRGI AKKKKG IGLKVTGKTKRRFFPNMLTKRLWST 
EENRFLFCLKI SAS ALRH IDKLGLEKVLERAKSKNF 

CPn_0328 373220 374992 

CT035 hypothetical protein 

LKYRE I FMS FLRRH I SL FR SQKQL I DVFA PVS PNLELAE IHRRVI EDQG PALLFHNVI GS 
SFPVLTNIJ^GTKHRVDQLFSQAPD^IARVAHLISSTPKLSSLWKSRXJLIJCRISSLGLKK 
ARFRRFPFVSMS SVNLD HL PL LTSWP EDGGAF LTLP LVYTES PT LTT PNLGMYRVQRFNQ 
imiGLHFQIQKGGGMHLYEAEQKKQ^PVSWLSGNPFLTLSAIAPLPEWSELLFATFL 
CX3AKIXYKKT^HPHPLLYDAEFILVGESPAGKRRP£GPFGDHFGYYSLQHDFPEFHCHK 
I Y HRKD A I Y PATWGK PYQ EDFY IGNKLQ EYL S PLF PLVMPGVRRLKSYGESGFHALTAA 
WKERYWRESLTTAIJUI/SF^LSLTKFLM^ 

FS ETANDTLDYTGPSLNKGSKG I FMG IGKAI RDLPHGYQGGKIHGVQDI APFCRGCLVLE 
TSLEDRCIKSLLHHPDLKSWPLIItADNLRET IQSEKDFLWRTFTRCAPANDLHALHSHF 
ATHfPNVNFPFVI DALMKP SY PKEVEVDPSTKQKVS ERWHAYF PNKET FY I 

CPn_0329 375085 376146

Phospholipase D Superfamily [leader (33) peptide]

KMJIKRQKDKLKI CVT I STL I LVG I FARAPRGDTFKT FUCS EEAI I YSNQCNEDMRKILCD 

AIEHADEEIFIJ^IYNLSEPKIG^SLTRQAQAKimrriYYQKFKIPQIUCQASNVTLVEQP 

PAGFJ<I>IHQKALSIDKKDAWI/3SANYT^ 

KfiO^KYFVLPQDRKI AIQAVL EKIQTAQKT I QVAMFALTHSE I IQALHQAKQRG IHVD I 
I S^SHSKLTFKQLRQLNrNKDFVSINTAPCTLHHKFAVIDNKTLLAGS INWSKGRFSLN 
DSSf. 1 1 LENLT KQQNQKLRM I WKDLAKH S EH PTVDD E EKE 1 1 E KSL PVE EQEAA 

CPn_0330 376930 376202

ci'S.S.3 hypothetical protein 
FJ&IEMLLLSRQLFSVLPSRFQDLHVTO 

RKEgVRKRREKNYLR I FRVLSRFDVMR 1 1 RFD PYGALSAQS I AKDSRQN SPLVEK I SEE I 
ATNEA I RLALLAI GDR EQ E E KKQRH RYKL LGQ KQAKVLL SQLRHVHLDFKKL YC DS KKKE 
DQEKDEKNKQKRS IKVTKKKKGISLGAAASQAIAAAAEAWVIARNKGVLiETASTLFYQKD 

CPn_0331 378452 376701

CT082 hypothetical protein

ioir^imavsggggvqpssdpgkwnpaiiqgeoaegpsplkesifsetkqassaakqeslvr 
sgstgmyate3qinkakyrkaqdrsstspksklkgtfskmrasvqgfmsgfgsrasrvsa 
krjasdsgegtsllptemdvalkkgnri spemqgffldasgmggsssdi sqlslealkssa 
f^'g^slslsssesssvasfgsfqkaiepmseekvnawivarlggemvsslldpnvetss 
lvs^^tgnegmidlsdlgqeevstawtspravsgkvkvsssdspean 
paelzaekqesreqlsedqmmlaramaglltgaapqevlsnsvwsgpstwpppkfsgtl 
ptWsgdkskhkspgiekstnhtnfsplregtvksaevkslphpesmyrfpkdsivsree 
peawkestafknpenssqnflp iavesvfpkesgtggalgsdavs ssyhflaqrgvsll 
a plpratddykekl eahkg pgg p pdpl i yqy rnvavep p i vlrs pq p fsgss rlsvqgkp 
eaasvhddggggnsggf sg dqrrgssgqkasrq ekkgkklstdi 

CPn_0332 378676 378536 

CHLTR T2 Protein 

YLDSR I RVI PLARQRCTLLHLLAVLC PP I SFFTQGVS PCVFFCFLDF 

CPn_0333 379117 378800

LtuB 

VDFFVFVFFMGKPKKSRTDRALAQEIQKK5TEVLKKPARIKAKNRRKFLIAKEQKTLKHR 
AQEYDQLVRSLLDSQKKDTDKVLIFNYENGFVFTDKDHFSKYSIRL 

CPn_0334 379308 379823 

CT079 similarity 

TMSVH I T PRKC F I LC I LSMFTL PTLFPKAHL I LFS PY I VLCFYC FS KDKGLVLALGCGVL 
.^DLALGSRGVFLLLYPLTALITHKAHLI FSKESKAALVIVNMI FYGVFLLLT I PMCALFG 
HEVRWSIDVLMIPLKCSFLDNLIFTSVIYILPCAINSGIHKMISFFRRLVCY 

CPn_0335 379808 380674

folD-Methylene Tetrahydrofolate Dehydrogenase

EIGMLLRGIPAAEKILORLKEEISQSPTSPGLAVyLIGNDPASEVYVGMKVKKATEIGII 
SKAHKLPGDSTLSCVLKLrERLNODPSIHGILVQLPLPKHLDSEVILQAISPDKDVDGLH 
PVNMGKLLLGNFDGLLFCTPAG I IELLNYYEI PLRGPHAAIVGRSNIVGKPLAALMMQKH 
POTNCTVTVLH:;QSENLPE TLKT.\DI t IAALGAPLF IKETMVAPHAVIVDVGTTRVPADN 
AKG^TLt/;DV^F^^^IVVTK^AArT^VPC«VGPMTVAMLMSNTWR^YQNFS 

'T'n_!}'i i*'. "SHOSt^ IS1 c j91 

H(:DKMH:;r^lt:::;::wF^^^w.:^Ix'RY^ML:u\MAMLPK^FLVLu:LCU - :x:coKTTTIEGEQMTI 

i-TI<[Vl/T:M-::AKiiKA::i.:;rjO£DRCFHKID:;[YNNWJPYSEL:U INRAPADVP ITLiJVEL 
::K^l 1 IX»Vr/^[,YKL:;l:CRF^E*'^VGPr 1 KTLWLLHLK^*OTLPPKDVWCOHYKDMGWQHLEFO^ 
f fl'KTL [ Y KN PI I VO [ DU \ JWKGYAVDCLNC IONTFC PNNYVEWC X IE I KTGCHHPiXJRPWR 

f F: :kaa' ;t n ,i; ruf jma l at: ;r ;nh [qkwtvegk :yth r ldtrtc ;kf*lelssyp iqsvswh 

\".'a .'AYAIjA I ATVf iMTFPSK I EAKOWACE1 IH C LTY niDGACo 



CPn_0337 J3214L 381575 

smpB-Small Protein B

[ EEI FPGNQGKR IL I IVLRPKNCFLLYWTL3P tMGEDLMAQKE IVSNPKALRNYEVIETL 
EAG rVLTGTEIKSLRDHGGHLGDAYVIVSKGEGWLLNAS lAPYRFCNI YNHEERRKRXLL 
LHRYELRKLEGK IAQKGMTLIPLGMFLSRGYVKVRLGCCRGKKAYDKRRT 1 1 EREKEREV 
AAAMKRRHH 

CPn_0338 382272 383375 

KNMKrV\ -RN' r r" ., 'A *\'N7 P 1 1 '." "! fV: . I ET'i r *: ' r " A iTATl'L. "YTTi-' 

AKVYEKGAIJI P^KRKF^*- /KELTEANLElbooAGtMA^ i'l ^GJ JCr kLLJMEKEDFPML 
PDICNAIJ*FSLPA£QLKTMI£RTSFAVSREESRYV^^ 

IDAEVTLDKSFSGEYI IPIKAVEEIIKMCSDEGEAAIFLDQDKIAVECDNTLLITKLLSG 
EFPDFSPVISTESNVKLDLHREELrTLI^QVALFTNESSHSVKFSFLPGELTLTANCTKV 
GEGKVSMAVNYSGELLEI AFNPFFFLDILKHSKDELVSLGI SDSYNPGI ITDSASGLFVI 
MPMRLHDD 

CPn_0339 383405 384034 

CT339 hypothetical protein 

VTTLPMF>DCICSLKLiKNFRNHSDLEI SIAPKL^AGGKTNIXEALYVIjSLGRSFRTQHLT 
DTITFGSSHFFLETQFEKDHLPQALS I YTDKQGKKICYNQLP IKTLSQL rGKVPIVLFSS 
KDRIiISGAPADRJ^FLNLIXSQCr^HYTl^LSYYHRALCX5RNAI^ 
WSNTAPTYPSNGFSWRNFQIYPKNFGLTT 

CPn_0340 383842 384156 

(frame-shift with 0339) 

PLYPLLIVLSSRSSAEKCSLiOCQANL^GLWDEQLVKHGTYLSIORFLCSQKLSDLSKEL 
WSNNLKEQLALKFKSSLIKNSDI SETAVAEEFHKQLS ISLPRDLE 

CPn_0341 384160 384495 

(frame-shift with 0340) 

GSTSVGPHREDFLLTMNC^PVSQFSSEGQKHSLiLAIIJlL 

HAG Lrj^ERVGQLLDPAPTLGQTL I TSTHMHGEL P KT S LVL S I ENAQ VS EQ 1 1 

CPn_0342 384619 385062 

predicted OMP [leader (19) peptide] 

HMKKFLLTILFI^VGNPLFSETSVIQTLPSGIGGLKETSKOKESVV'C^HAFLRSYTSLKP 
IARVl^CEHYX5VFIWm^RJCFTI^2<i{A£H IGGVIVR 
VAXAHPDCP EEAKKEKLFSWLLRTQGLH 

CPn_0343 384999 385595 

(frame-shift with 0342?) 

LPRRSQKRKAILMAPPNAGSTLARRYRCVKFVQFATGGK^ 
SLI)VLILSGNRHSKFLPFRLFYENrX3KVCTrETKIJ7rPH 

MKEFI^EGmTPIIEHVPEAALEQTVMXDKQKNSRLKPYPNQDIYVIHCFGSRPYNLYGF 
PKKWSLNQKNEINPEKLEK 

CPn_0344 387432 385558 

yaeL-Metalloprotease 

SSRYWTIIYFII^ALAi/SILVLIHELGHLWAKAVGMAVESFSIGF 

RIGCIPFGGYWIRGMERTKEKGFJ<GKIDSWDIPCGFFSKSPWKRILVLVAGPLANIIi 

AVLAFS ILYMN^RSKNYSDCSKWGWVHPVLQAEGLLPGDE I LTCNGKPYVGDKDMLTT 

SLLEGHLNLEIKRPGYL7\^SKEFAIDVEFDPTKFGVPCSGASYIiLYSNQVPLTKNSPME 

NSEIJlPNDRFVWMDGTIiFSMAQISQILNESYAFVKVARNDKIFFSRQPRVIA^^ 

YLRNELIDTQYEAGUCGKWSSLYTLPYVINSYGYIEGELTAIDPESPLPQPQERLQLGDR 

I LA I DGT PVSG SVD I L RLVQNHRVS I IVQQMSPGELEEVNSRDADKRFIASYHSEDLLQI 

LNHLG ES H P VEVAG PY RLLDPVQ PRPW I DVYS S ESLDKQ LEV AKK I KNKDKQRYYLERL D 

AEKQKPS LG I SLKDLKVRYNPS PWMLSN ITK ES LITLKALVTGHLS PQWLSG PVGIVQV 

LHTGWSVGFSEVLFVIGLXSMNI^VU^PIPVLDGGYILI^LWE^ 

LVPFTFLL I I F F I F LT FQDLFRF FG 

CPn_0345 388587 387436 

CT345 hypothetical protein 

LKVACLKHLAVLGSTGSIGRQTLEIVRRYPSEFKrrSMASYGNNLRLFFQQLEEFAPLAA 
A VYNEEVYKEACQRF P HMQ F FLGQEG LTQLC I MDTVTTWAAS SGI EAL PAIL ESMKKG K 
ALALANKE I LVCAGELVS KTAKENG I KVLP I DS EHNALYQCLEGRT I EG I KKL I LTASGG 
PLU^LEELSCVTKQDVl^PIW^GSKVTVDSSTLVNKGLEI I EAYWLFG L ENVE I LA 
VI HPQSL I HGMVEFLDGSV I S IMNPPDMLF P I QYALTAPERF AS PRDGMDFS KKQTLEF F 
PVDEERFPS I RLAQQVLEKQGS SGS FFNAANEVLVRRFLC EE I SWCD I LRKLTTLMECHK 
VYACHSLED I LEVDGEARALAQE I 

CPn_0346 389690 388704

070-troD/ytgD-Integral Membrane Protein 
KKGSLMALGPS PYYGVSFFQFFSVFFSRLFSGSLFTGSLY IDD IQI I VFLAI SCSGAFAG 
T F LVLR i<MAMYANAVS HT\"LFG LVCVCLFTHQLTTLS LGTLT LAAMAT AMLTG FL I Y F I R 
NT FKVS EESSTALVFS LLFSLSLVLLVFMTKNAH IGTELVLGNADSLTKEDI FPVT I VI L 
ANAVIT I FAFRSLVCSSFDSVFASSLG I P IRLVDYL 1 1 FOLSACLVGAFKAVGVLMALAF 
L 1 1 PSL I AKVI AKSI RSLMAWSLVFS IGTAFLAPASSRAI LS AYDLGLSTSG I SWFLTM 
m I WKF ISYFRGYFSKNFEK I SEKSSQY 

CPn_0347 391078 389678

069-troC/ytgC-Integral Membrane Protein
TFGTNPEALSRKTIWIVLI>ILSCVFSDTIFL3SFLAVTLICMTTALWGTILLISKQPLLS 
ES LS H AS Y PG LLVGALMAC WFS LQAS I FW I VLFGCAASVFGYG 1 1 VFLGKVCKLHKDS A 
IXFVLWFFAIGVILASYV-KESS PTLYNR INAYLYGOAATLGFLEATLAAIVFCASLFAL 
WWWYRQIVVTTFDKDFAVTCGLKTVLYEALSLIFISLVIVSGVRSVGIVLISAMFVAPSL 
GARQLSDRLSTI L ILSAFFGG I SGALGSY ISVAFTCRAI I GQQAV PVT L PTG PL W I CAG 
LLAGLCLLF3PKSGWVIRFVRRKHFSFSKD0EHLLKVFWHISHNRLENISVRDFVCSYKY 
QEYFGPKPFPRWRVQILEWRGy/KKEODYYRLTKKGRSEALRLVRAHRLWESYLVNSLDF 
SKEGVHELAEEIEHVLTEELDHTLTEILNDPCYDPHPQt I PNKKKEV 

C?n_0^AH 3 l U815 3010^7 

068-troB/ytgB-ABC transporter ATPase
FGWLLNVKDETFWSVHNLCMT^EHAAVf.YI 1 1 VA- GLGKGGLTA I [/]PNGAnK. f ;TLLKASLG 
L r KPSSrrr/YFFNQKFKKVRQP [ AYM PQRAZ V fjWDF PMTV r , D L A LM' K "YSYKGMWCR ISS 
DDPREAFHILERVGLEUVADRQ [GQLiX X r ;QQOF''AFLARALMQKADLYL.MDELF:'AIDMAS 
FKT^OVL0FXRD0GKTI^VHHDL^HVRQC,FDHWLLNKRLICrGPTDEC-r,NGDT[FQT 
YGOEICLLEOTLKLGRGKCFGC; 

067-troA/ytgA-Solute Protein Binding Family
W I LKNASR EMDAKMGY I FKVMRW X FCFVACG I TFGCTNSCFQNANS R PC I L5MNRM IHDC 
VERWGNR LATAVL I KCSLDPH AYEMVKGDKDK I AGS AV I FCNGLGLEHTLSLRKHLENN 
PNSVKLGERL rARGAFVPLEEDG ICDPHXWMDL3IWKEAVIEITEVLIEKFPEWSAEFKA 
NSEELVCEMS I LDSWAKQCLSTI PENLRYLVSGHNAFSYFTRRYLATPEEVASGAWRSRC 
T S PEGLS PEAQ I 5VR D I MAWDY I NEHDVSWFPEDTLNQDAUCKI VSSLKKSHLVRLAQ 
KPLYSDNVDDNYFSTFKHNVCLITEELCGVALECQR 

CPn_03Sn 3^3 L6 f J 3^1684 

!.-•■ ;i.vv!-; : ;j[jK! , : T'L, I< Vf'A T ' J< ' .I'M WDLCR ^ I . t" ;i /V^Hi):"; I.KKMr.'AA- 
EO LLKM LKT F I L ES ET PRT I NP KP kA P RG S KKRRDF I NFTKT D I ERVLELARGVGD KDLL 
ARFSPKKPLTSLKREL IRS I RNG IVSVELWNAYVEAVKAVSS PNLEVTS PFV 

CPn_0351 393861 395432 

adt-ADP/ATP Translocase 

KI KVFQRVNMTKTEEKPFGKLRSFLWP IHTHELKKVLPMFLMFFC ITFNYTVLREiTKDTL 

IVGAPGSGAEAI PFIKFWLWPCAI I FML I YAKLSN I LS KQALFYAVGT PFL I F FALF PT 

VIYPLRDVLHPTEFADRLQAILPPGLLGLVAIIJINWTFAAFYVLAZLVGSVMLS 

ANEITKIHEAKRFYALFGIGANISLI^ASGRAIWASKLRASVSEX^VDPW 

I VSG LVLMA S YWW INKNVLT DPR FYN P E EMQKGKKG AKP KMNMKDS FL YLARS P Y 1 LLLA 

LL V I AYG IC INL I EVTWKSQLKLQYPNMNDYS EFMGNFS FWTGWSVL IMLFVGGNVT RK 

FGWLTGALVTPVMVLLTG IVFFALVI FRNQASGLVAMFGTTPLMLAVWGAI QN I LSKST 

KYALFDSTKEMAYIPLDQEQKVKGKAAIDWAARFGKSGGALIQQGLLVICGSIGAMTPY 

LAVILLF I IAIWLVSATKLNKLFLAQSALKEQEVAQEDSAPASS 

CPn_0352 395478 396830

No robust homolog present in Genebank/EMBL as of 11/7/98 

WVGIFFINSHFTNSYAFFNQKVIITVRHSGCTMKCSPLTLVPHIFLKNDCECHRSCSLKI 

RTIARLILGLVIJ^VSALSFWIJ^ISYAIGCT^ 

PNELQK I IYNRYPKEVFYFVKTHSLTVNELKI F INCWKSGTDLP PNLHKKAEAFG I D ILK 

SIDLTLFPEFEEILLQNCPLYWLSHFIDKTESVAGEIGLNKTQKVYGLLGPIJiPHKGYTT 

I FHSYTRPLLTLI SESQYKFLYSKASKNQWDS PSVKKTCEEI FKELPHNMIFRKDVQGIS 

QFLFLFFSHGITWEQAQMIQLINPDNWKMIX:QFDKAGGHCSMATFGGFL^ 

SNY E PTVNFMTWKELKVLL EKVKES PMH PASALVQ K I CV^^ITHHQNIJ J KRW^FVR^T S S Q 

WTSSLPQYAFHAQTYKLEKKIESSLPIRSSL 

CPn_0353 396893 397135

No robust homolog present in Genebank/EMBL as of 11/7/98
LRFRNIKKSLrFIKRIRYSQSGKEQKGARPFFKKSITSSLVILLLEAIFNENFSSIIQNN 
FNKNFKNKNIS INRI FVKFT I 

CPn_0354 397062 398507

No robust homolog present in Genebank/EMBL as of 11/7/98

YK&3 IKILKIKTFLL IGFLL^^^Y^^^QIDEPRKCMSNITSPVIQNNRSCNYYFELKNST 

Tlff^VlSAILlXGALIAFLCVAAPVSYILSGAI^LGLLIALIGVILGIKKITPMISSKE 

QVF^PpELVNRI RAHYPKFVSDFVS EAKPNLKDL I S F I DLLNQLHS EVGS STNYNVS EELQ 

QKf[€£rF EG I ARLKNEVRTASLiCRLE^ AAS SRPLFPSLPKI LQKVFPFFWLGEF ISAGSKV 

VELfiRVKKIGGSLEEDLSDYIKPEMLPT 

QHil&AALNGEWNIiaHSDLNTMKC^ 

RYS^QMSLIKTVPADLWENLCCLTLDHTGRPQDMEFA^ 

LT^'SLDQFKTIRJIQSTNIAMFLENIATHNSTFRSLPPITVHPLKRSVFSQPEEDESSLL 
CPn_0355 399955 398591

No robust homolog present in Genebank/EMBL as of 11/7/98
IRDFYLHI IYTAFNRS ISKELAMSMTIVPHALFKNHCECHSTFPLSSRTIVRIAIASLFC 
IGALAALGC LAP PVSY I VG SVLAFI AFVI LSLVILAL I FGEKKLP PT PR 1 1 PDRFTHVTD 
EA^SLSISAFVREQQVTLAEFRQFSTALLCNISPEEKIKQLPSELRSKVESFGISRiiAGD 
LE?J^PIFEDLLSQTCPLYWLQKFISAGDP<3VCRCLGVPRECYGYYWI^PLOTSTAKAT 
IF^KOTHHILQQLTKEDVLLLKNKALQEKWDTDEVKAIVERIYTTYTARGTLK^ 
KEtlSKELLLLSI^GYSFDQLQLITQLPRDAWn^FVDNSTAYNLQLCALVGALSSONL 
LDESSIDFDVNLGLYVIQDLKEAVQAFSASDEPKKEIX3KFLLRHLSSVSKRLESVLRCGL 
HRI ALEHGNARARVYDVNFVTGARIHRKTS I FFKD 

CPn_0356 400465 400109

No robust homolog present in Genebank/EMBL as of 11/7/98
KQY^FQ YMNESGWDWLCD FDSQG EG FQLSRLVGLLH S SWALYEAKEQFYLP EVS LLTWE 
EL35i!WQLLSKPTKHGVAKDLCNVFEKHFQRFRQYLGSL^ 

CPn_0357 401341 400469 

No robust homolog present in Genebank/EMBL as of 11/7/98 

YSSHNGASMVNrQPWRNTQVNYSOATQFSVCQPALSLIIVSWAAVLAIVALVCSQSLL 

SIELGTALVLVSLILFASAMFMIYKMRQEPKELLIPKKIMELIQEHYPSIWDFIRDQEV 

SIYEIHHLISILNKTNVFDKAPVYLQE?XLQFGIEKFKDVHPSKLPNFEEILLQHCPLHW 

LGRLVYPMVSDVTPGTYGYYWCGPLGLYENAPSLFERRSLLLLKKISFGEFALLEDGLKK 

NTWSSSELVQIRQNLFTRYYADKEEVDEAELNADYEQFDSLLHLIFSHKLS 

CPn_0358 401757 401578

No robust homolog present in Genebank/EMBL as of 11/7/98 
EEVLSVSMKLIPTQDSIERETDSKRDKKIFTIYICSSKVLAGHFFSHLDKHNKIHESIGV 

CPn_0359 401994 403817 

lepA-GTPase 

ITLQYI LKEYKI ENIRNFS I IAHIDHGKSTIADRLLE3TSTVEEREMRE0LLDSMDLERE 
RG IT IKAHPVTMTYLYEGEVYQLNLI DTPGHVDFSYEVSRSLSACEGALLIVDAAQGVQA 
QSLANVYLALERDLEI I PV LNK I D L P AAD PVR I AQQ I EDY IG LDTTN 1 1 ACS AKTGQG I P 
AILKAI IDLVPPPKAPAETELKALVFDSHYDPYVGIMVYVRI I SGELKKGDR ITFMAAKG 
HSFEVI/3IGAFLPKATFIEGSLRPGQVGFFIANLKK r /KDVKIGDTVTKTKHPAKTPLEGF 
KE [NPWFAGIYP IDS3DFDTLKDALGRLQLNDSALTIEQESSHSLGFGFRCGFLGLLHL 
KI IFER I TREFDLDI IATAPSVTYKVVLKNGKVLDIDNPSGYPDPAI IEHVEEPWVHVNI 
ITPQEYL^N IMNLCLDKRG I WKTEMLDQHPLVLAYELPLNEIVSDFNDKLKSVTKGYGS 
KDYRLGDYRKCC [ I KLCVL INEEP t DAFGCLVHRDKAESRGRS ICEKLVDVI PQQLFKI P 
lOAArtlKKVtARETTRALSKNVTAKCYGGDtTRKRKLWEKQKKGKKRMKEFGKVSIPNTA 
V ' KVLKLb 



fTji.n hypnt.h^r ic.i L pror^m 

VA[,(J'[ , r|[^[.[(lLAVMGKNLVLNM[DHGFGV:;V , ('NPTPEKTRDFLKEYPNHPEL , /GFE. r ;LE 
iJf-VN^LKRI'HKtMLMiyAtlKI-VDOiiHIALLPn.EKJ&VI I DGGNSYFKDSERPCKELQEK 

«' : i [ . f i / a ; i : :t x : e ixia r m ; p ' ; i m pcg n p e aw p lv a p i fq : j i aak vqc ; r pcc rj wvotog ag 

HYVKAVffUG rRYGp fOLft-'KAYG I I.RDPLKLJATAVAT TLKEWNTLELESYL t R IASEVL 



ALKDPEG I PVTEJT ILDVVGCKGTGKWTAI DALNGGVPL.'L I IGAVLARFLJ ^WKEIREQA 
ARNYPCTPLIFEMPHDP3VFIQDVFHALYASK I IGYAGGFMLUjEASKEYNWGLDLGEIA 
LMWRGGC r IQSAFLDVIHKGFAANPENT5LIFQEYFRGALRHAEMGWRRTATAIGAGLP 
r PC LAAA rT FYDGYPTAS S SMS LAQC LRDYFGAHTYERNDR PRG EF YHTDWVHTKTTERV 
K 

CPn_0361 40*S650 405382

tyrS-Tyrosyl tRNA Synthetase

TL^NNADWLQEI-LlL/rLuulLiKIir t« laJwML* " i fvjrtVH JLEG I JYTEFJ'i LILQSYD 
FYHLFKNYGT ILQCGCSDQWCNITSC IDFIRRKGLGQAYGLTYPLLTNACCKKIGKTESG 

tvwldsdltspfelyqyllrlpddt i pk i art lt llsnee iqd idrrvqtdpvavkefv a 
qdilsaihgdlgleealsvtrsmhpgnlsslsekdfhelfaggmgasldksevlgkrwld 
lflvlglckskge irrli eqkgvy innvp ianehs^'ceeod icyghyvllaqgkkrklvl 

YLN 

CPn_0362 407843 407055

fliA/rpsD-Sigma-28/WhiG Family
LDKKKF/KTQCTQNIIEVWNFYWE^ 

DLYASGVEGLVRAVERYNPERSRRFEGYAVFL IKAAI I DDLRKQDWVPRSVHQKANKLSG 
AMDS LRQS LGKEPTDLEIXT EYLNI SQQ ELSGWFVSAR PAL I VSLNEEWPSQS DEGAGMAL 
EER I PDERAETGYDVVDKQEFS LCLANA I QF^ EEKERKVMAL YYYEELVLKE IGKVLGVS 
ES RVSQ I HS KALLKLRAALSAFR 

CPn_0363 409700 407943 

flhA-Flagellar Secretion Protein

EAVFVSGKKDGVRGMIFVPLS ILVLI FLPLPQ ILLDFGLCIS FALS LLTVCWVFT LNS S N 
SAKLFPPFFLYLCLLRLGUJI^STRWIVSSGTASSLIVSLGSFFSLGSLWAATFACLLLF 
FVNFLMVS KG S ER I AEVRS R F FLEAL PAKQMALD SDL VSGRA S YKAVKKQKN AL I EEG D F 
FSAMEGVFRFVKGDAI ISC I LLLVNWSVTC L YYTSGY ALEQMWFTITjGDAL VS QVPALL 
TSCAAATLISKIDKEESLLNYLFEYYKQLRQHFRWSLLIFSLCC I PSSPKFP IVLLASL 
LWLAYRKEEPAS EDSC I E2^AFS YVEGAC PKEQESQFYQWRAAS EEVFEDLGVRLPVLT S 
LRIEERPWIJiVFGQNVYIJDEMTPEAVLPFLRNIAHEAI^AEWQKYLEESE^ 
IVPKKISLSSLVVLSRLLVRERVSUCIJPKILEAVAVYQNSGDSLEILAEKVRKSLG^ 
GRS LWDQKGTLEV I T I DFHVEEL INS SYS KSN PVMQ ENV I RR\T!S L LER SVFKDFRA IVT 
SCETRFEMKKMLDPHFPDLLVLSHDELPKEIPISFLGIVSDEVLVP 

CPn_0364 409954 410238 

fer4-Ferredoxin IV 

KENSMAKLVITS DDEQQEFELEDNSE I AEPC ESMG I P FACTEGVCGTCV I EVLEGRENLS 
EFTEPEYDFLGEPEDSNERLACQCRIKGGCVKVTF 

CPn_0365 410498 411544 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FKGTQVNSLIMATI SPISLTVDHPLVDTKKKSCSNFDKIQSR ILL ITAI FAVLVTIGTLL 

IGLLLNIFVIYFLTGISFIAVVLSNFILYKRATTI^KPRACGKHKEIKPKPVSTNLQYSS 

ISIAINRSKENWEHOPKDLQNLPAPSALLTDNPYEIWKAKHSLFSLVSLLPGGNPEHLLI 

SASENI^KTLLIEETSQNAPISSYVDTTPSPKSLLNEAIQErTRVEIOTELPAGDSGERLY 

WQPDFRGRVFLPQIPTTPEAIYOYYYALYVTYIC/TAIOTNTQIIQIPLYSLREHLYSREL 

P PQS RMQQSLAM ITAVKYMAELH PEY PLT I ACVERS LAQL PC/ES IEDLS 

CPn_0366 411976 412440 

No robust homolog present in Genebank/EMBL as of 11/7/98 
MGYLPVSATDVLFESPAAPLINSANTQNQKL IELKGKQQAESSPRT ITSVI LEVLLVIGC 
CLIVLSLLAIRPALQFTLETGHPAAIAVLAVSGTILLVAVIILFCFLAAVPFAAKKTYKY 
VKTVDDYASWHSHQQTPTLGT I FSG I VYAESQAQL 

CPn_0367 413078 413836 

No robust homolog present in Genebank/EMBL as of 11/7/98

SFPLNRYFMTKTTSIPDVHE24QSHLSVT)ERLISESPVLTKKEVIAKIIKLTALILJUjAIA 

VGTAWAGVLGMPLMAIATGAALLAAVVLSCLLLRRREPSKPTEELLGPQKHVPKDIAAQ 

VQPSVPLDYOKLLRNEWTLVNTLSEINISOTLQDPNQRYYWEHCGAPITLVATTGDIAK 

PRLKTSGRVMIVTJAANSNMQSGGAGTNAALSAATHPTCWNOTRTSG^ 

RSAPWI NRDWTNK 

CPn_0368 413766 414107 

No robust homolog present in Genebank/EMBL as of 11/7/98

TLAKDYLWVNAAQHPGSIETGRINDTNPGEAHFLAQLLGPKYEGELKAHPEKLSNVIKJCA 

YLNCFDEALNNQATWQVPLISSSIYSPGGKLELEPVNQTKPNSSAYKLYHIRT 

CPn_0369 414345 415562 

CT058 hypothetical protein_2 

NIMTDSNPLPSYTDASLYRTPAKHSYPIRLPLNRTDRIEKILKIVTLTLALACALGFSIA 
AGILAMPIFSAVWITLAIAAVSLYSLLKKPKLYEILPQIEPESEQSSLSPSPQPPEQQD 
LPLQIDPLPDPESLP EVSLADLTT PPEELTA I TVTPGYEALL EQNWDLL PS LAAVD PS FT 
TET PQQ PCF rWKLKDSKL I FI STSGDI AVPR I KTQGRVM I VNAANEN I SREGGGTNKALS 
LATSLQCWNASRLFRAHSRSGSQLQPGECRSAKWENSDHTSNDHVPGKAHFLAQLLGPEA 
AKCNNDPKOAFEVSKKAFHNLFQEAEI IGVDVIQLPLIGCNLFAPSRLLNLGKTRAEWI E 
AIKLALITSLQDFGWEQDNOEEQKI I ILTDKDQPPI IPPRFDLTTP 

CPn_0370 415755 416912 

CT058 hypothetical protein_3

KR I FFKLFVFYLKSFMSTTEPNLTNVNLTML I3SESMPTQLASHKLKGLDLVAF ILI IG I 
AVSSGTAAI ILGIPLLFILTALAVLAFSILLYFLLREPKSPISVTHQPTPITfCDTDLPPV 
P P LALTPVPT EAVL EE PPLPSPRTHQTLLO ENWDR I PDLQ ANTDM P F I AADNQTGY AWH L 
KNSNLTLI STLG F I EKPRYKTQG I VM IVNAATPNMANNVKGTSLALAKATSVRCWENSKK 
SPDPLRSKQPLQLGECRSAKWENUIGTTNAGKAGLPOFLGQLLGPKASDYNYNPNDAFTF 
CRQAYLNCLNEAKRRKTT\A'0LPLLCf3KFPGwPKDEETTSLRL0WIDGVKLALIDALQTF 
GSEAENQNQPWVI ILTTLARHPLITP 

CPn_0371 4L7M1 4 17V; J 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KTMPVSfjAPLPT^MRP^.SGNLAlLMEr-'fl^K AT, K/K I IQDKTTKT I KLLVK I LVA I LVI EVLG 

[rAAFFrpGTPP[CL[ruv[ J iLrrvL/-v[ l [j,7rKLA[ J wiKTr:(rr'rAF00[KnKL^sK3[ 



CPn_0372 4 I ' c S 1 4 I KU ! > } 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NYRACHRN IClNMMS^PWTGT , ";^Ar J F j V[ 'SJTYi/ 'A-'.FlA'Hl.r/): *f 1R( * IK [ AFAA'I'PALLLLN 

Trv3GrvA[AMm'vrsvt^A\TTVKjria-i4.;i,nxA[M[,i:;MYKrTnrsoNT['[:;N 

CPn_0373 418356 420218 

gcpE 

NS E r FE I FMTL I TPA INSS P RKTHTVR IGNLY IGSDHS IKTQSMTTTLTTDI DSTVEQ I Y 
ALAEHNCDI VRVTVQC I KEAQACEK I KERL I ALGLN I PLVAD IHFFPQAAMLVADFADKV 
R I NPGNY I DKRNMFKGTK I YTEAS YAQSLLRL EEKFAPLVEKCKRLGKAMRIGVNHGSLS 
ER IMQKYGDTI EGMVA3A I EYIAVCEKLNYRDWF5MKSSNPKIMVTAYRQLAKDLDARG 
WLY PLH LGVT EAOMGVDG 1 1 KSAVG IGTLLAEGLGDT I RCSLTGC PTTE I PVCDSLLRHT 
":v.; -r. :■'-:[■■ Kr.'i <* *r/.'i "*r:: •■ w^rrA^Tr w-rv -'vr;,KLY" -TT^rrELLri' 
'.■ 1 \-:vAr*TTi-y<:w/\ ' '".kpap'tpv: :'FA- r ' v^hhmovf '.l. / ~iwf.eiwd'"p-\ 

VHUAPKVHFHASDPFIHTSRDFE-'EKQGHQGKPTKLyFSRDFDNKEEAAXSIATEFGALLL 
DG LG EA WLDL PNL P LQ DVLK I A FGTLQN AGVR L VKT EY I SC PMCG RT L FOLEEVTTR I R 
KRTQHL PGLK I A IMGC I VNGPG EMADADFGFVGS KTGM I DLYVKHTCVKAH I PMEDAEEE 
LIRLLOEHGVWKDPEETKLTV 

CPn_0374 420209 420961 

CT056 hypothetical protein 

VDSMTLS FHTHPLNYWTFEEFDGLP I RHGVFSKQKDAEGTVFAAKNPE I ASALQS P KYCD 
LH Q RKG T SVRCVTPTS PTYQ P ADG LCTQSPLLSLHIRH SDCQ AA I FYDREHHA I ANVHSG 
WRGLLGNIYAVTVGTMKKLFHTKPQDLFVAIGPSIGPDYAIYPDYATLFPRSFLPFMNPK 
NHFDLRAIARKQLTNLGISKDRIFISDI/TTYTEliDAFFSSRYLAHHPDPNLTGQHSKNRN 
NVTAVLLLPRD 

CPn_0375 421112 421615 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RLSMKLGASTNHKVHE PVK PKKAQ LAE I EANKTQATEGTLRS KSLALO I ARAVLYI LFAA ' 
LMLAAGITFVTFLAJjGFPLIQAYS IAGI XTLVGLAIGLVLLILSLLPKEDEEADALSRNA 
LLPLTI XVIEQQPITPKPEI PYSYLTKLALLTSLFLTLRRSSSQRKTH 

CPn_0376 421680 422294 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FKVVTAKAP^TEIRDHGARVPSLFLLSPETSHWKGDKEVSAPLK^ 

TKMNSRKKAGCWAIFNSPTPGVSSTLVIAWrPWGYYDKDVQDILERKDPMSSSLSEKDSK 

EFLKNLFVDIXEM3FTSVHIHAEEAFTPLDHTGKPHFKPX^T1/Y^ 

VS ADTQ FTL FLTQD ECNPFHDKKRG 

CPn_0377 423441 422347 

sucB-Dihydrolipoamide Succinyltransferase 

I^E^IPNIAESISEVWASLLVTEGALIQENQGliErESDKVNQLIYAPVSGRIFWE 
VSf^DVVPVGGVVGKIEPAGEGEI^DSQSKETIEAEIICFPQSGVROSPPENKTFIPLR 
DQMDCCSCGLSAGDRGETRERMTSIRCT 

KQ^EFLSRYGVK^FWSFFVKAVLEALKAYPRVNAYIDGEEIVYRKYYDISIAVGIDRGL 
VyJi^IRDCDKLSNGEIEQKI^LALR^ 

PPQVGILGMHKIEKRPWLDNE IVIAI^MYVALSYDHPJjIDGKEAVGFLVKVKEGLENPA 

CPn_0378 426195 423445 

sucA-Oxoglutarate Dehydrogenase 

rift EFNYFMDSEFVGQVYSSDMDWI ESMYQRFMNHETLDPSWKYFFEGYQLGQAASPSE 
A^«ISGNETIAbG^EQKSQFI^IYRYYGYLQSQISTIAFTTDSRFIQEKIAKIDLDEQ 
V^SAGLL PKAQVSVREL I EAIJCKCYCGSLTLETLTCT PELQEFVWNLMEKRQVERFAEQL 
IJl^fKDLCKATFFEEFI^IKFTGQKRFSLEGGETLVTMLEHLVHYGSAIXSI 
RG^RIiNVLTNVLGK PYR YVFMEF EDD P AARG LESVGDVKYHKGYVLK S HQ KDRETTFVML P 
K AS H L E SVD P I VEGWAALQHQGHAG KEQ S SLA I LVHG DAAF SGQGWY ETLQLS R VPGY 
STEGTLHIVVN>TYIGFTAVPR£SRSTPYCTDIAKMI^IPVFRVNSEDWACIEAIEYALQ 
VHE&FSCDVI IDLCCYRKYGHNESDDPSVTAPLLYDQIKRKKS IRELFRQYLLEGQFADI 
S$ETLAS IEKE IQESU^FQVUCGTDPEPFPKKECHHCDRLNNGELILHDCDVSLDRET 
LFf?MSSRLCGFPDNFHPHPKIKTLLEKFJ^K>1AEGGV^ 

SQQDS IRGTFSQRHLVWSDTVTGDTYS PLYHLSAEQGSVEMYNSPLSEYAILGFEYGYAQ 

QA^I^LVLWEAQFGDFA^AQIIFDQYISSGIQKWDr^SDIVLLLPHGYEGCGPEHSSSR 

ISRYUJLAAI^FQVVLPSTPVQYFRILREHAiaiDLSLPLVIFTPKl^ 

FTEPGGFRAI LEDADPNYDAS I LVLCSGK I YYDYAEMLPQDRRKDFSCLR I ESLYPLALE 

DI^\^LIDKYSHLKHFVWI^EESKNMGAYDYMFMAI^DILPEKIiYIGRPRSSSTASGSAK 

LSRSELVTCMETLFSLR 

CPn_0379 426268 426765 

CT053 hypothetical protein 

KNKKMLCTCSRIQDGNPWMKSERLKKLESELHDLTQWMQLGLVPKKEISRHQEEIRILEH 
KIYEEKERLQLLKEM3EIEEYVTPRRSPAKTVYPEX5PSMSDIEFVEPTETEIDIDPGETV 
ELE LTDEGREDGAVEVDYS HEDDEDP FS DRNRWRRGG I IDPDANEW 

CPn_0380 426671 427876 

hemN-Coproporphyrinogen III Oxidase 

KSTIPTKTMKTLSAIAIAGDAWSLIPMLM^KAPLALYIHIPFCTKKCRYCSFYTIFYK 
S ESVSLYCN AVI QEG LRKLAP IQET H F I ETVF FGGGT PSLVS PLDLKR I LKELAPHAR E I 
TLEANP ENLTVS YLRQLQETP INR I S VGVQTFDDS I LQLLGRTHSSS AA IT ALQECQNHG 
FSNLSIDLI YGLPTQSLEIFLSDLHQALTLPITHISLYNLTIDPHTSFYKHRKILVPTIA 
QEEILAEMSLLAENLLLSQGFQRYELASYAKPDYPAKHNLYYWTDRPFLGLGVSASQYLH 
GERSKNYSHISHYLRAVRKNLPTQETSEILPKKERIKEALALRLRLLEGADLAEFPSTLI 
SMLTQDVKLQNLFSVHGQCLALNRQGRLFHDTIAEEIMGYSF 

CPn_0381 429836 428037 

CT326 similarity 

SLPNKFRALMTAPTESRSSPPTLLEETEPLSPNPIPADIQIPRITISPPSLDVSTVASSA 
ED 1 3VF I AGG PRSSSSASVASDVYELVCLCGG DEDPEP PDSEVRTLYVNGSWQTHQEAVQ 
EL-LYI 3 EVRGEAVRLLYNDGSGMSPWP IS PCRTLPTLDH PLCQALLTVWEQFFSAPENQN 
REFLVIFYGDASPY IQQALTQSRHSPRIWVC I SPTVFIQGDFRVHNYRVSGDFFSSLDC 
RGTPAErJTTILPYSSGLEGVFLPSIRCPSFTWAVRFGEQCLVANRGEDVEDRGGLSQDAE 
RSOLPH.nERDLAWIDSTDPSSMSRLVEWLNOGSPSSDKEINPYPORCPDVALSALYAIS 
RV"OLAOISWTU\r;VHEGLDLOICYSLILMHTTF^\mYFFLLrTNYPQSREPFRTARrVAQ 
: II.YLP3 [ LVLVFDCONVLRKLWMPQEILRAI F ISA3T I 3GS EVFVECTRWMGRGLRHRVQ 
VFVOORVECCGLPVGTVRASYRDRAGF I EGFLQTVHGOLYLPV3 EMVLNQIA IQVPR ELV 
PIT irrrAWDLHNK.'JAEENWiJdGDVLAVGQTLNF I LCAFVLFVNLWFFVKSVLRHSRRRRR 

( :i'ti_l>iH.» 4 307S? 4 100 ih 

y.jtjf.Vy ciL-:*AM - r)»-i«*ndent Met hyr r,instpr-isp 

i'VTLYL[T^fTF/;TRAVCTLP:;vrGELVnRLD';LIVESDR(X5RArU^LWKrPEVHKFPLAI 

r ,:;kiiar lpkawdfylep evkhge^jwgl isdaglpc iadpga:;lvrraral/j i pvqafsgp 

<::;rTL.ALML^(;LPSQ:>r--'rFLl]YLPO:JPKERVKSrKKAATGKFV:JTSVi , [ET':YRNVYTFE 

:;lldti,p:;yai-:li 'va::dl'*gp^elvltrqvq.jWRttedlgsvko. l ) etkvpte flfh ipn 



CPn_0383 43171 L 43074'^ 

CT047 hypothetical protein 

VQE7TTFLTLPMQKSLT3FDDF5QAYAEKVPAIALIG3ALEDDKDAL EELLV3ESFKELiCG 
QGLMPATLM3WTETFALF0EHETLGI IHAEKFPLATKEFLSRYARNPQPHLTILIFTTKQ 
ECFRELSKALPSALSLSLFGEWPADRQKRI IRLLLQRASRVO I5CSCSLASLFLRALAST 
SLPDILSEFDKLLC3VGKKTSLDHSDIKELWKKEKASLWKFRDSLLKRDPVEGHQQLHF 
LLEDGEDPLGI ITFLRTQCLYGLRS I EEGSKENKHRMP/LYGKERLHQALN3LFYAETLI 
KNNVQDP I VAVETLV I RMVNL 

t - _ 11 -i i : • i ■ ' ; • 

hctB-Histone-like Protein 2 

VrTCLIRGIKMIGAQKKQSGKKTASRAVTlKPAKK'/AAKRTVKKATVRKTAVKKPAVRKTA 
AK KTVAKKTT AK RTVRKTVAK K P AVK KV AAKR WK KTV AKKTT AKRAVRKTV AKK P VARK 
TWAKGSPKKAAACAI^CHKNHKHTSSCKRVCSSTATRKHGSKSRVRTAHGWRHQLIKMM 
SR 

CPn_0385 434042 432522 

pepA-Leucyl Aminopeptidase A 

FLVIKGEFVVIiFHAQASGRNR\fKADAIVLPFWHFKDAKNAASFEA£FE 

KTGEIEIiYSSPKAKEKRIVI££LGKNEELTSD 

ISELRLSA£EFliVGL5SGILSI2iYDYPRYNKVT5RNLETrPLSKVTV 

AAIFEGVYLTRI)LVNRNADEITPKKLAEVAIJ^LGK£FPSIOTKVLGKDAI 

VS KGSCVDP H F I WRYQG R PKS KDHTVL I GKGVT FDSGGLDLKPGK SMLTMK EDMAGG AT 

VLGILSALAVLELPINVTGI I PATENAIDGASYKMGDVYVGMSGLSVEICSTDAEGRLIL 

ADAITYAIJCYCKPTRIIDFATLTGA^^Sr J GEEVAGFFSNNDVIJ^DLLEASAETSEPLW 

RLPLVKKYDKTLHSDIADMKNI^SNRAGAITAALFLQRFLEESSVAWAHLDIAGTAYHEK 

EEDRYPKYASGFGVRS I LYYLENSLSK 

CPn_0386 434543 434046 

ssb-SS DNA Binding Protein 

KS KGYLMMFGHFAGYLGADPEERMTS KGKRVI TLRLGVKTRVGMKDETVWCKCN IWHNRY 
DKMLPYLKKGSGVIVAGDI SVESYMSKDGSPQSSLVI SVDSLKFS PFGRNEGSRSPSLED 
NHC^VGYESVSVGFEGEALDAEAIKDKDMYAGYGQEQQYVCEDVPF 

CPn_0387 435229 434699 

CT043 hypothetical protein 

NNNNU^DSLMSRQNAFXNIJ<NFAKELKLPDV I LFVDGEFSLHLTYEEHSD 

RLYVYA PIJLIX3LPDNTQRKIALYEKLLEGSMLGGCflAGG^ LMHCVLDMKY 
AETNLLKAFAQLFIETVVKWRTVCADICAGREpSVimiP^^ 

CPn_0388 435323 437320 

glgX-Glycogen Hydrolase (debranching) 

STMEKVSSYPSVPLPLGASKISPNRYRTALYASQATEVIIJUjTDENSEVIEVPLYPDTHR 
TGAIWHIEIEGISDXJSSYAFRVHGPKKHGMQYSFKEYLADPYAKNIHSPQSFGSRKKQGD 
YAFCYLKEEPFPWDGDQPLHLPKEEMI IYEMHVRSFTQSSSSRVHAPGTFLG I IEKIDHL 
HKLGINAVELLPIFEFDETAHPFRNSKFPYLCNYWGYAPLNFFSPCRRYAYASDPCAPSR 
E FKTL VKTLHQEG I EV ILD WFTJHTGLQGTTC S L PW I DT P S YY I LDAQG HFTNYSGCGNT 

L^^n^RAPTTQWI LD I lrywveemhvdgfrfdlasvfsrg psg S PLQFAPVLEAI SFDPLL 
ASTK 1 1 AE PWDAGGLYQVG YF PTLS P RWS EWNG PYRDNVKAF LNGDONL IGTFASRISGS 
QDIYPHGSPTNSI>TrVSCHrX3FTU:DTVITNHKHNFJ^HSNR 

EDPGIL EVRERQLRNF FLTLMVSQG I PM I QSGDEYAHT AEGNNNRWALDSNANYFLWDQ L 
TAKPTLMHFLCDIj I AFRKKYTCTLFNRGFLSNKE I SWVDAMGNPMTWRPGNFLAFK I KS PK 
AHVYVAFHVGAQDQLATLPKASSNFLPYQ I VAESQQGFVPQNVAT PTVS LQPHTTL I AI S 
HAKEVT 

CPn_0389 438254 437319 

CT041 hypothetical protein 

TVFNFKRFYQKDSQRQNGNTTCLRPFKKTCKELI EFRRRTVKLLIOWLLGLFFSMS I SGF 
S EVKVS DT FVKQDTW E P KI RVLLSNESTT AL I EAKG P Y R I YG DNVLLDT A I CGQRCWH 
ALYEGIRV«EFYPGUXUCIEP\nDDTASLFFNGIQYCGSLYV«RKDNHCIMVSNEVTrED 
YLKSVLS I KYLEELDKEAL SAC I ILERTALYEKLLARNPQNFWHVKAEEEGYAGFGVTKQ 
FYGVEEAIDWTARLWDSPQGLIIDAQGLLQSNVDRLAIEGFNARQILEKFYKDVDFWI 
ESWNEELDGEIR 

CPn_0390 439171 438134 

ruvB-Holliday Junction Helicase 

RKSDREGSYMTHQVAVLHQDKKFDVSLRPKGLEEFYGQHHLKERLDLFLCAALQRGEVPG 
HCLFFGPPGLGKTSLAH IVAYTVGKGLVLASGPQLI KPSDLLGLLTSLQEGDVFF I DEI H 
RMGKVAEEYLYSAMEDFKVDIT IDSGPGARSVRVDLAPFTLVGATTRSGMLS EPLRARFA 
FS ARLSYYSDQDLKEILVRSSHLLG I EADSSALLEI AKRSRGTPRLANHLLRWVRDFAO I 
REGNC I NGDVAEKALAMLL I DDWGLNE I D IKLLTT 1 1 DYYQGG PVG I KTLSVAVG ED I KT 
LEDVYEPFLILKGFIKKTPRGRMVTQLAYDHLKRHAKNLLSLGEGQ 

CPn_0391 439701 439510 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KDQLYKQEKP I PKAT I LSRNLEVMLDMPKGKRQTLFLGRTSGRS ALYSYSRR I LVLLNAF 
MRGP 

CPn_0392 439814 440383 

dcd-dCTP Deaminase 

MS I KEDKWI REMALNADMIHPFVNGQVNVNEETGEKL ISYGLSSYGYDLRLSREFKVFTN 
VY NSWD PKCFTEDIFIS ITDDVC I VP PNS FALAR S VEY FR I PR^A/ LTMC IG KSTYARCG 
1 1 VNVT P F E P EWEG HVT I E I SNTT PL P AK I YAN EG I AOVLFF ES STTC EVS YADRKGKYQ 
KQCGITVPCV 

CPn_0393 440229 440723 

CT038 hypothetical protein 

KFLTLRHC0RKFTLMKGLPR5Y3LSLVPPARFLM0TEKE3IKSNKASPYLV5KVSVRKKN 
WGFRLLEEVMI KSWWV [ FS I L EOGFVYDRAIOELRTEELRLQSKVSSLCQDILSAQEKQR 
0 LC LH LO [■fWO DC AA [ EAAL E OR LG L I P KGY K K I .CVS P KOQ 3 ENK D 

CPn_0394 440727 AAl'ti.H 

tlyC-CBS Domain protein (Hemolysin Homolog) 

KHTMIPTMLMFFI ECFTU '.TGF ICU'Sj XhUVlU^U [."KYKIi:*Kt»KKO0RVATLLLHPH 
HLL ETL I FCD EGLN I A EQNC FA [LFGDAAwWW/i TV( It. PI, A ITT, I UIET LPKAVALPFNTQ 

l a:;:;vapl i [a'vtk r fkpllhwt; evo urtwfjvt i uwujiot i(.u-OF.LKr:vL,(j:x;KDFGV 

VN0EE3PLLYCYLSL^DC3VKERM0P^UL[,f'YfH0'rr'r.L-.NLYLLF::K0Hr:;RVPICNDN 
LUN[J/;iCTAR'^a>LHDKPLOr::;DDLIJ'IJ,KKP/VMEM:TE.';AKMALCOMA/\EDF/ru;Mr E 
DEYfj.'J EEGI, ITOEDLFEIVACE LVV^UW I [,nT: :« 1AIA/ IT LI IL.RE mi: I FD ENL 

F-rNNM EAT [CCMIAFQ EOT [ PTT^^MKL^WtirJI .I.FOVI .UAAl'MR ERUVY I KKLYO 

CPn_0395 44L'.>55 44 3L75 

CT257 hypothetical protein 

GNCMTNSALFWICVN I ICIVLQGFYGMMEMACVGFNRV-RLQYYLTKDHKKAfiy INFLIRR 
PYRLFGTVMLGVNIAL^VGSEoSRNCYRALGITPDYAPFTQIFIWIFAEIiPLTISRKI 
PEKLALWCAP I LYY 3HY I FYPL IQL IGSLTEGLYYLLNI RKEKLNSTLSRDEFQKALETH 
HEEQDFNT I ATN I FSLSATCADQVCQPLEQVTMLPSSANVKDFCRT I KNTD INF I PVYHK 
AR KNV I G r AH PKDFVNKALDEPL INNLHS PWF ITAXSKL IR I LKEFRDNRSSVAWLNAS 
GEPIGI LSLNAI FKILFNTTNIAHLKPKT ISVTERTFPGNSR IKDLQKELDIQFPQYPVE 

tlaoly' .>_■! ,i ,'cr \rv '.tz'J' vim.:.:.. , /r? 1 *l :g :ktv: :knll 

CPn_0396 444159 443241 

yhfO-NifS-related protein 

YSMIYLDNNAMT PPERGLLEFLQKTFLI EGTYANPS SVHQLGKKSRQLVLEASHWMQKVL 
SFQGRVLYTSGATESLNLAIASLPKDSHVITSGSEH PAI LEPLKHSSLSVSYLNPEEGRC 
VLT I EQIERAVTPKTSAI I LGWVNSETGAKADIAAIAHFAQERQU3FIVDATANVGKERI 
VIi PSGVTMAAFSGHKFHALSG I GALLVS PGVKI^PQLWGGGQOGGLRAGTENLWG I ASLL 
Y I FKYLDLHQER I SQE ILTHRNGFEKAI KAR I PDVH I HCADQ PRANNVSAI AFPPLEGEV 
LQ IALD I EGVACGYGSACSSGATAPFKSLVS>K3VDEELTIATLJIFSFSHIIJ1*QEDVERAV 
GI I EKWERLKNS 

CPn_0397 445124 444381 

PP2C phosphatase family 
EHFVDFDYFGLSDIGRVRARNEDFWQVNI>!SQWAIADGVGGRI^ 
I DEQQS KLMGYGDDQYKETLKK IIJIjEVNGVVYEHGQMEEHLQGMGTTLS F I Q FRKDRAWL 
FHVGDSRIYRI REGELRRLTEDHSLENQLKNRYGLPKQSDKVYSYRH ILTNVLGSRPYVM 
PDIRNLPCEKEDLYCLCSDGLTbMVPDIDIRDILNQPATLEERGNALISLANTRGGDDNA 
TWLVRIQ 

CPn_0398 445518 445700 

No robust homolog present in Genebank/EMBL as of 11/7/98 

I EELPMQ I ENSS I LFAEWMKWF I FS VI SAPWFLPGCTLI PKEXVTKVPSQLWSESLSQ 

P 

CPn_0399 445759 446523 

CT253 hypothetical protein 

YKLMRVL^KSLNCESIDI^SKNFPRARIFCKISNLRT^^ 

CTHLGSSGSYHPKLYTSGS KTKGVIAMLPVFHRPGKSLE PLPWNLQGEFTEE I SKRFYAS 
EKVFL I KHNAS PQTVSQ FYAP I ANRL PET 1 1 EQFLP AEF I VATELLEQKTGKEAGVDSVT 
ASVRVR VFD IRHHKIALIYQEIIECSQ PLTTLVNDYHR YGWN SKHFDST PMG LMH S RLF R 

EWARVEGYVCANYS 

CPn_0400 446527 447306 

CT254 hypothetical protein 

SKElSKFILIiSLGVAALASKKFFIWPAPSGKTPIJa^QVLFGGALLVFSSLVALSVSSQ 
TA^LSTMTGISLAFAFLFYLLFLPKDITPAILFSGERPVKTSWRALXjSAIRMWIIIIPV 
TQ^IGIMMSKFLTLVLPTQEIHTQEVTQEVQNSLPITGHYISMIIJJLGVLTPFGEEVFFR 
G I LQT FLKNKMTRI AAVLC SS I 1 FS F I H I EHSLGSWVFVPVLFVFSLSAGFLYEKDRH I L 
S#£ALHGLFNLTSLLFLGIK 

CPn_0401 447884 447495 

CT255 hypothetical protein 

MRP.HAFSKiIGTVRAMVVEGRCPWSL(X}SLVSMVEHILGECQEFHEAVLG^ 
AGD^LTLVL I LCFLLEREGVLASEDVANEAMEKLRRRAP Y I FAEDYKPVSI EEADRLWEL 
AiC^lEKNEST 

CPn_0402 449012 447888 

mutY-Adenine Glycosylase 

NRKRFC MTK I AF S EKAKNF PVEALKKWFEKNKRSLPWRDNPT PYSVWVS EVMLQQTRAEV 
VliDYFNQWMERF PT I ESLAAAK EEDV I KLWEG LGYY SRARHLLEGARMVMEEFHGKI PDD 
AljLSJLAQ ^ RGVGPYTVH A ILAFAFKRRAAAVDGNVLRVLS R I FL I ETS I DLESTRTWVSR I 
A^AXLPHKSPEVIAEALIELGACICKKVPC^HRCPVRQACGAWRENKQFVLPVRHARKKV 
I F£jMRLVAI VLYDGSLWEKRRPKEMMAGLYEFPYI EVEPEEGLQD I EGFTKKMELSLES 
■ Pll'EFLGNLKEQRHAFTNHKVHLCPI IFKATSLPQFGELHLLSDIDHLAFSSGHKKIKDAL 
LI$6&GDVRSRESIGV 

CPn_0403 449009 449710 

yceC-predicted pseudouridine synthetase family 
NF^LSNDKRAALQYFMENFSWLATQVSRLSS FLRSQLPNHSKQEILAS IRQHRCRVNGF 
IERFESYKVQPGDRVSLSLrPSTKC^PSILWEDDYSIIYEKPPHLTTEQMAHMTRFFTVH 
RLDKGTSGCLLMGKSKQAATELMKLFKGRK I H KQYI AFVFGH PKKKFGTVKSYTAPVYRR 
CGAVI FGAAG PSQGEP I KS AYKWDCWV I LL S EMSTT DLKNS LPRS S ALS SMLT P 

CPn_0404 450962 449871 

No robust homolog present in Genebank/EMBL as of 11/7/98 

EL EALEQKYG KA VLL I ALS ELG I DTMS LLSG H RLEG F P P I AEVMAACD RCSMDFC E I LKS 

QSMDLWADAASCVDGLLQDPFWSTAI ASG I AKSSLQETEFEC ESKVMVLSSWGEQGAQVC 

SPFNLERICMSFPSLKVFSLKKNGCENMGIQL3ASCMMLLMSIFFVATNGGSTPIWITKE 

NLMALVALVLSHYO^YFVPATGDPQRGNILGNPEVNAILARGMGMRVDLERKRGGESSSS 

RYLEI^^CFE^SLTKTSLLSDANNVOERDKCLLQMSTSLMHTAGLNLQRPPVPTPSGVT 

AHPQPQPDP\ATTSQPSLLGARERSPVSSRGRFPWLPLSVISPRSHPGRVERRDLEDEEE 

EVMF 

CPn_0405 451814 450966 

CT105 hypothetical protein 

N I QTSH S RV LLK KFS K EFT I RT YRS LG FT DY LGGC LTN PLGK F PS PQN PQ WT I APS STT 
PQAVS3 AVQG F LQTGG AAS ST ATTTT ASG AS A LGLS PDQVQ ALLTNL LNVGQ PSVGQPST 
SAGTSGASSSSA3MQQQLLQLILDKTTGSGGSSVSSEC3LOQLLSLVSQMTTSQGGSGGTQ 
AGQAAS V L LNLLSATGSAAAN P LGTAAS LAQ I I YAA VTS PGAKKTSE FCYNYCG ETCQGN 
CGCPTCGCPDGQCCCGGFGRFFCCVWKNCCGIGECSQEPAIPL 

CPn_040h 4S1>160 -1SJ8h^ 

fabI-Enoyl-Acyl-Carrier Protein Reductase 

CGFMLK [DLTGKVAFVAG rGDDQGYGWO T AKLLAEAGATI IVGTWVP I YKI FSQSWELGK 
KNFJJRKr^NGTL.LErAKIYPMDACFD^PKDVPCDIAENKRYKGtTC-irri^EVAEQVKKDF 
( lit [ D LLVI!fJLAN;;PF,E:'K:;LLETnRKGYL.AALj:A./.; /'CFVSLL^Hr^J IMMRf SGSTISLT 
YLA: IMRAVPGYHt * 1M: ; ' J A K AALEf j DT KT L AW C -\0 R F* WG [ RVNT [ . PLA3RAGKA IGF [ 
L-'.HMVDYYfjKWAP [l'EAMNAEOVr;AVA>\r'LA^PL.V,A rTOCTLYVDHGANVMG IGPEMFPK 
US 

(.Tri_(Mf)Y 4'>W ( W .I'.^HSS 

HAD superfamily hydrolase/phosphatase 



rsTYGDAMEKLLVTD I U7T I Tf f'J^HHLDKKVYER LYALHOAGWK LFFLTGR 7YK Y AAP LFSD 
FDAPYL]JC^Q^ASVWw:rr';GNLLYSKSLP^DLU:iLCDCMEGATALFGVESGAPYGDHY 
YRFSPTPIAQDLHEYVOPRYFPNAKEREILFETRSLKDDYAFPSFAAAKVFCLRDEVIRI 
QKELERQEALTSVATMTLMRWPFDFRYAILFLTDKGVSKGKALDRVVNILYDGKKPFVMA 
SGDDANDLDLI ERGDFK IVMSSAPEEMHVHADFLAP PADK.NC I LSAWEAGVRYYDDLMSL 

CPn_0408 454090 454591 

CT102 hypothetical protein 

-p VL1(M — r. - — * > ^r-'^r' - -'irir^' "Mp". r V r -' w f , . —rp'-j 

k;vwaki ■«■—•/•, ; it. ■ >!■ :.:Sf. 1 v .'NK- ". ■ * "'-"h :t:,lt 

EFPPDTDINhLujLTJnK^jJjJATJDt-LoK^iw^l-KKvVTN^L 

CPn_0409 454645 455127 

CT260 hypothetical protein 

MTTWTLNQNNLT KF LKSSD EEP FLER ESG LTY I N I Q ANGNEL PLFFV I RS EG E t LQL ICY 
L PYQLHES HKASTARLLHLLNRD I D I PGFGMD EEQG L I FYRLVL PCLNGE I H DTLLR I Y I 
DT IKLVCDSFSHAIGL I SSGNMNLDELRRQALQEQQEKRNE 

CPn_0410 455087 455833 

dnaQ-DNA Pol III Epsilon Chain 

DVRLFKSNKKNVMSSQTMDVLI FYDTETTGTQ I ERDRI I EI AAYNSVTDESFLTYVNPEI 
PI PDEASKIHGITTDAVLSAPKFPEAYEGFRKFCGEDS ILVAHNNDGFDFPLLGKECRRH 
SLEPLTNRTIDSLKWAQKYRPDLPKHNLQYLRQVYGFAENQAHRALDDVVILHKVFTSLI 
GDLPPC^VLDLLCX3SYHPKVFKMPFGKYKG0P^VDIPKSYFEWLEN<XAI^KPENKDIKA 
AIALLHQPT 

CPn_0411 455794 456609 

CT262 hypothetical protein 

RHQS RYS S ITSTDM I LTAAFS PC PND I FL FRS FLKD PQF R PLLNQVT I AD I ETLNTLALQ 
RRLSLMKMSAALF PLVSDYYNLMDVGNTLGYNSG P I VLS LDPECSLDTLATPGEMTTAHA 
I/TKLYYPKAiO.IPMPYDKILSAII^KVDGGALIHEERFSYDLOLTLRACrc 
FPLPLGCLAIAKWPf^TVDALTAALRJCSLICSIiKDPITAGAKAVEYSKN^^ 
GTYINKETFQLSKTGKKALHMLWKANECCOYT 

CPn_0412 456515 457246 

CT263 hypothetical protein 

EPISTKKPFNYljKLGKKLYICSGRPMNAVNTPKKILCIVADYREISPLIEQLDFTQINEH 
L Y SYRCTDYH LDLY I VHVWG S T AVLNALQ S YCQ AYT DYD LW I N PG FVGAC S P E I P LGQC Y 
T I EKIANLTTDTPPVLSEDPPYI FDALPDSLPKSSLVTSPVLYHYGFHKTFKLLCMEGYA 
I ASQAAEHH I PCSFLKITSDYTVPGDCPFSPiEEVSQKLTCTLVELLPELMERAI PPKLL 
LPCP 

CPn_0413 459209 457227 

msbA-Transport ATP Binding Protein 
VFMKIJ^LKAVIJ^KNKLVILGCSI 

GKLVKVS ELSQKD I LENWQAI SKDS ETLTVSDATTY I AEHGKSTASLT SKLS KFVRNYI D 
VSRFRGLAI FLICVAI FKAVTLFFQRFLGQWAIRVSRDLRQDYFKALQQLPMTFFHDHD 
IGNLSNRVMTDSASIALAVNSLMINY IQAPITF ILTLGVCLS I SWKFS ILICVAFP IFIL 
PIWIAPJCIKKLAKRIQKSQDSFSSVLYDFIAGVMTVKVFRTEKFAI^KYCEa^^ 
EEKSAAYGLLPRPLLHTIASLFFAFVWIGIYKFAI PPEELIVFCGLLYLIYDPIKKFGD 
ENTS IMRGC AAAERFYEVLNHPDLHSQKERE I E FLGLSNT IT FENVSFGYQEDKH I LKNL 
SFTLHKGEAIjGIVGPTGSGKTTLVKLLPRLYEVSG^KILIDSLPITEYNKGSLRNHIACV 
LQNPFLFYITrVWNNLTCGKDMEEEAVLEALKRAYADEFI LKLPKGVHSVLEESGKNLSGG 
QQQRLAIARALLKNAS ILILDEATSALDAISENYIKNI IGELKGQCTQI I IAHKLTTLEH 
VDRVLY I ENGQK I AEGT K E ELLQTC P E F LKMWEL SGTK EYNRVFVPDHK LVANPTDMA I T 
T 

CPn_0414 460203 459172 

accA-AcCoA Carboxylase/Transferase Alpha 

LC LRI VC I KM I LF I RGEH I LMELLPHEKQWEY EKA I AEFKE KNKKNS LLS S SEIQKLEK 
RLDKLKEKI YSDLTPWERVQ ICRHPSRPRTVNY IEGMCEEFVELCGDRTFRDDPAWGGF 
VKIQGQRFVLIGQEKGCDTASRLHRNFGMLCPEGFRKALPXGKLAEKFGLPVVFLVOTPG 
AYPGLTAEERGQGWAIAKNLFELSRLATPVT IWIGEGCSGGALGMAVGDSVAMLEHSYY 
SV ISPEGCAS ILWKDPKKNSEAASMLKMHGENLKQFG 1 1 DTVIKEP IGGAHHDPALVYSN 
VREFIIQEWLRLKDLAIEELLEKRYEKFRSIGLYETTSESGPEA 

CPn_0415 461522 460221 

CT266 hypothetical protein 

SQTGFLPGLTLI FVI I IVWCNAFLIKLCVIMGLOSRLOHCIEVSQNSNFDSQVKQFIYAC 
QDKTLRQSVLKIFRYHPLLKIHDIARAVYLLMALEEGEDLGLSFLNVQQYPSGAVELFSC 
GGFPWKGLPYPAEHAEFGLLLLQIAEFYEESQAYVSKMSHFQQALFDHQGSVFPSLWSQE 
NSRLLKEKTTLSQSFLFQLGMOIHPEYSLEDPALGFWMORTRSSSAFVAASGCQSSLGAY 
SSGDVGVIAYGPCSGDISDCYYFGCCGIAKEFVCQKSHOTTEISFLTSTGKPHPRNTGFS 
YLRDSYVHLP IRCKIT ISDKQYRVHAALAEATSAMTFS I FCKGKNCQWDGPRLRSCSLD 
SYKGPGNDIMILGENDAINIVSASPYMEIFALQGKEKFWMADFLINIPYKEEGVMLIFEK 
KVTSEKGRFFTKMN 

CPn_0416 461871 461557 

himD/ihfA-Integration Host Factor Alpha 

EALSNMATMTKKKLISTrSQDHKIHPflHVRTVIQNFLDKMTDALVKGDRLEFRDFGVLQV 
VERK PKVGRNPKNAAV P I H I PARRAVKFTPGKRMKRL I ET PNKH S 

CPn_0417 463047 462244 

amiA-N-Acetylmuramoyl Alanine Amidase 

REKGMKLTKYLNTKQLRSMISRLFVRYSLFMSKQLSFFALCVLGSHPIFAQTPNPPQRVR 
R.*JEVIFIDrcHGGKDQGTASKELHYEEXSLTLSLALTVQ3YLKRMGYKP0LTRS3DVYVD 
LGKRVALGNRGQf3DVF IS IHCNHSSKAAAFGTEVYFYNGPr/GSPTRNRMSEVLGKNILAA 
MEKNC I LKS RGLKTANFW I RDTSMPAVLVETGFLSNSP ERAALQDARYRMH VAKG I AEG 
VHNFL^GPSFOKPKONIAKIRKPOIQAN 

^PnjMlH 4M4CU 4 62^53 

murE-N-Acetylmuramoylalanylglutamyl DAP Ligase 

M LJ L K K i . 1 .1 IC. VO A K I Yi I K VR P L C VRN LT R D S RC V SVG DIF IAHKGQP iTJCTNOFAVDA LANG 
ArAt X L\ NP [•*[,.: WOt rTPMLEELEAELSAKYYEYPS^KLHTIGVTGTNGKTTVTCLr 

kai .i.i r ;yok [■■ :t ,i ,i * 7v r i-i 1 1 u^eicjv r y dc rrr pt pa l UjY'c l at m^/ pun r da v/m e v ^ s i 

^I,A::(;f'VAY'['N!-[;rAVI.TNlT[.DHLDFHOTFETYVAAKA.KLFSLVPF'S(":MVVrirrDSPYA 
;yt r E' iAKAi^V [TYi : f r.SAAUYPATL rOL.'J.'IGTKYTLV/GD^K t A' "'IJSF [GKYNVYNL 

i "A a r -;tv! i a i .m *n[.i :i n.i.\ k icuvjpffr ^rldpvlmgpc^vt^ t dyai {tpdale^nvltgl 

lll'.1 .1.1 l/V'.Ul. [ VVI . \ t ,\ m<| jH';K l'KU^AOWERYGFA , /*/T*iDNt'Pf;F.PPKL IVNF ICDG 
FY :KNYF t R L I i\* KO-\ ! TYAL: : I K'AMj I Vt. I AGKGHEAYO I FKH<yt"/At-'DDKOTV< 'KVLA 
.t'V 

CPn_0419 466897 464876 

pbp3-transglycolase/transpeptidase 

QLFPNTN rWN I PQK KVSV FY PM SY RK R3T L I VLGVF ALY ALLVLRYYK IQ I C EGDHWAAE 
ALGQ H E FCVR D P FR RGT FF ANTTVR KGDK DLQQ P FAVD I TKFH LCAD P LA I P ECHRDE 1 1 
QGrLQFIEGQTYDDLGLKLDKKSRYCKLYPLLDVSVHDRLSLWWKGYATKHRLPTNALFF 
ITDYQRSYPFCKLLGQVLHTLREIKDEKTGKAFPTGGMEAYFNHILEGDVGERKLLRSPL 
NRLDTNRVTKLPK[X^GDrYLTINPVrOTIAEEEX*ERGVLEAKAOGGRLIU1NSCTGEILA 

i.AOVE'FFOtTr.'vriJi : "jfj'"'-r- \ v" v r :v vz-e-g umkf: ;v - - ' •l^v.teaglfsc 
yy i i-tteep rr \"v\ j "?\ .vx* \\ j r ' : :> >y r ^i ■( m nmymaiok . .frr~/Acr-\op c u^r^ 

VAWYQQKLLAIXIFGRKTGIELPSEASGLVPSPHRFHINGSLEWSLSTPYSLAMGYNILAT 
G IQMVQAYA ILANGGYAVRPTLVKK I VS ASG EEYHLPTK EKTRLFS EE I TREWRAMRFT 
TLPGGSGFRASPKHHSSAGKTGTTEKMIHGKYDKRRHIASFIGFTPVESSEGNFPPLVKL 
VS I DD P EYGLRADGTKNYMGGRCAAP I FS RVADRTLLYLG I L PDKKLRNCDEEAAALKRL 
YEEWNRSPKQGGTR 

CPn_0420 467120 466824 

CT271 hypothetical protein 

KS F PMNKS RFLRLCCC LC FCGS LFYFY INKQNSLTKLRLEI PCLSVRLRQLEQQN I SLRF 
L I DKI ERPDHLME IAALPEYQYLEYPS EES I SLLSYEL P 

CPn_0421 468007 467108 

yabC-PBP2B Family methyltransferase 

E I LMS E RAH I PVLVEECLALFAQ RP P GT F RDVTLGAGG H AYAFLEAYP S LTC YDG S D RDL 
QAIAIAE^LETFQDRVSFSHASFEDLANQPTPRLYDGVLADLGVSSMQLDTLSRGFSFQ 
GEKEELDMRMDQTQELSASDVLNSLKEEELGR I FREYGEEPQWKSAAKAWHFRKHKKIL 
S IQDVKEALLGVFPHYRFHRKI HPLTLI FQALRVYVNGEDRQLKSLLTSAI SWLAPQGRL 
VIISFCSSEDRPVKV^FKEAEASGLGKVITKKVIQFTYQEVRRNPRSRSAKLRCFEKASQ 

CPn_0422 468233 468784 

CT273 hypothetical protein 

GLAMVE I FNYSTS IYEQHASNNRIVSDFRKEIQMEG I S I RDVAKHAQ I LDMNPKPS ALTS 

LLC^I^QKSHWACFSPPNNFTKQRFSTPYLAPSLGSPDC^DEDIEKISSFLKVLTRGKFSY 

RSQITPFLSYKI)KEEEEDEDPEEDDDDPRVQCGKVIXKAXJ:LGVICSTNW 

FVQG 

CPn_0423 468788 469216 

CT274 hypothetical protein 

CMLDNEWKA I LGWG DDELEELR I SGY S FL RQGHY S KA I L F FEALVI LD P L3 I YDHQTLGG 
LYLQ IGENS QALAVLDQAL RMQG DHL PTL LNKTKALFCLGR I EEAT A I ATYL S SC P I PA I 
ANPA^EALLMSYSKATKKNAALVR 

CPn_0424 469528 470961 

dnaA-Replication Initiation Factor 

SRdSEl FSPSLMGWVDC IWESF INKESGMLTCNECTTWEQFLNYVKTRCSKTAFENWISP 
I Q VLE ETQ E K I RLEVPN I FVQNYLLDNYKRDLC S FVP LDVHG EPALEFVYAEHKKP SAPV 
ASg^SNEG ISEVFEETKDFELKLNLSYRFDNFI EGPSNQFVKSAAVG I AGKPGRSYNPL 
FiaGGVGLGKTHLLHAVGHYVREHHKNLRIHC ITTEAFINDLVYHLKSKSVDKMKNFYRS 
LI^tfLVDDIQFIXJNRQNFEEE^COTFETLINLSKQrVITSDKPPSQLKLSERIIARMEWG 
LVj&fVG I PDLETRVA I LQHKAEQKGLL I PNEMAFYI ADHIYGNVRQLEGAINKLTAYCRL 
FG^TETTVRETLKELFRSPTKQKISVETILKSVATWQVKI^LKGNSRSKDLVI^Q 
IAWJUCTLITDSLVAIGAAFGKra^ 

CPn_0425 470965 471564 

CT276 hypothetical protein 

FRGCPMFRRTGKGPFEDVQTLYEEETSSPSSYSPYSRSERPETPPSLFDNPKASEARPLN 
HNtTEESS L PQWS ST P RT ES LL PLEEPETTLG EGVT FKG ELAFERLLR I DGTF EG I LVSK 
GKJ££GPKGVVKADIQI^EAIIEGWEGNITVSGKVEX 
I LGYLAI AGITDHSERERDL 

CPn_0426 472111 471536 

CT277 similarity 

MViFSLLFPKLCYGCQAPGAYFCSNCLEKLLVEDREGRCLHCFRYLGSSETRLCSQCSPS 
SQI»FSLYLPSQTALSVYARACEGKRPALQFFSKS IAFELASLDETPSCIAYITSTISR 
KIVYEVAKLEKLLRI PLWPWLPKKRQ I EKLPKGEG I CFLSAYPLSQKWMQT I VGGS ASPL 
VS liSEFLSQNDQ 

CPn_0427 472153 473715 

nqr2-NADH (Ubiquinone) Dehydrogenase 

AVCYVFERVEASTFLS ITMLKKFINSLWKLCQQDKYQRFTPIVDAIDTFCYEPIETPSKP 
PF I RDSVDVKRWMMLVVIALFPATFVAIWNSGLQSIVYSSGNPVLMEQFLHI SGFGSYLS 
FVYKEI H IVP I LWEGLKI F I PLLT I S YWGGTC EVLFAWRGHKI AEGLLVTG I LYPLTL 
PPTI PYWMAALG I AFG IWSKELFGGTGMNILNPALSGRAFLFFTFPAKMSGDVWVGSNP 
GV I K DS LMKMNS STGKVL I DG F SQSTCLQT LNSTP P SVKRLHVDA I AANMLH I PHVPTQD 
VI HSQFSLWTETHPGWVLDNLTLTQLQTFVTAPVAEGGLGLLPTQFDSAYAITDVIYG IG 
KFSAGNLFWGN I IGSLGETSTFACLLGA I FL I VTG I ASWRTMAAFG IGAFLTGWLFKF I S 
VLIVGONGAWAPARFFIPAYRQLFLGGLAFGLVFMATDPVSSPTMKLGKWIYGFFIGFMT 
I V I RL I N PAY P EGVMLA I LLGNVFAP L I DY FA VR KYR KRG V 

CPn_0428 473719 474681 

nqr3-NADH (Ubiquinone) Oxidoreductase, Gamma 

NMS KGS S K HTVR I NQTWY I VS F I LG L S L F AGVLL ST I YYVLS P I Q EQAAT F DRNKQMLLA 
AHILDFKGRFQIQEKKEVWPATFDKKTQLLEVATKKVSEVSYPELELYAERFVRPLLTDA 
QGKVFSFEEKNLNPIEFFEKYQESPPCOOSPLPFYVILENTSRTENMSGADVAKDLSTVQ 
ALIFP I SGFGLWGPIHGYLGVKNIXJDTVLGTAWYQQOET PGLGAN ITNPEWQEQFYGKKI 
FLQDS3CTTNFATTDLGLEWKGSVRTTLGDS PKAL3AI DG I SGATLTCNGVTEAYVQSL 
ACYRQLLINF5NLTHEKKTGE 

CPn_0429 474666 475310 

nqr4-NADH (Ubiquinone) Reductase 4 

K ENPRMTEKKGY KSYFFDPLWnNNQ I L I -\ I LG IC.^ALAVTTTVQTA ITMG I AVS I VTGGS 
. L ;FFV:;r.LRKFTPDSVRMrTQLI I liJLFV [VTDOrLKAFFFDI^KTL^VFVGL I [TNCIVM 
< :i<. r :ii:;LARHVTP [PAFLDGFASGU;YOWVLLVIGVr PCLrOFOTLMGFR I IPOFVYAJET 
HPn/iYyr-JL: ;i iMVLiAPnAFFLLi^j IM [WLVN [ RD'^KKPKR 

:rru_ f )4 jo -t ji 1 :t o-i t 

nqr5-NADH (Ubiquinone) Reductase 5 

FMWI/JAY'l'WI^R'ilLLyAAFEONILLANrLC^MC J /TJWJ'JTRV'rrANi^UIM^VALVLTVT 

';:;inwFViiAK['i'f ;pKAL'iviESP3LA:;vN'LGrLCL>r r f [ vv [ AAPro 1 lclllekv:;rnly 
1 ; [ / ; r f i a ■ t , 1 a vnc a n ,c a : v lfg 1 t r ;i y p r 1 pmm [ f • u : ac ;r< -,vmi , a 1 v 1 lat r k e k la y 
'. : d n r r 1 1 ,(> ; Mf ; r : : [• vvr<: 1 , t am afmc ltg i d i . * k 1 • ; a k r ijk a y l v cw enttn p l k E;7 s 
^KHOPPrSKAPTWRR.;L 

CPn_04Jl 4764 47f,15L 

No robust homolog present in Genebank/EMBL as of 11/7/98 

K IMTTLFKYVPRSRQNPDTLTFLKR YSSVLLHSENS LSYR I FAKVLA I LLTS LAVAFAVT 

L FSC EG SQ LRLCALY IG I ALAI CVL LT I WYC I A J K T ATAC KK PPSISRIEIV 

CPn_0432 4763^7 476514 

-■".r •MDiv't* \ :. \. \ :n\ ' - r:\-v- • - \ :.-:[, w 

GLFLTGSAPL^LJ IWIAA.^. ITliJMLVlA^WK^ K i^NALLK'I'KVAriES 

CPn_0433 477327 476929 

gcsH-Glycine Cleavage System H Protein 

RTFRILYGTLYRTGSRKN^WSDYHWILPVHERWRLGLTEKMQKNLGAILHVDLPSV'G 
SLCKEGEVLVI LESSKSAI EVLSPVSGEVI D INLDLVDNPQK INEAPEGEGWLAWRLDQ 
DWDPSNLSLMDEE 

CPn_0434 479471 477276 

CT283 hypothetical protein 

RPWVR I YQQDLFCRLCRDPAWFFSLLSFTLRFYCLGRGWTLLSFFYKHQKKF IG IVIAW 

CVSGrGVGWGRFSRKGSAESTSRRTVFTTASGKRYVEKDFMAMKKFFAHEAYPFTGNPRA 

WNFINEGULTDYFLTTRVGEKLFUCVYHPGEKIFSKEKAYQPYRRFDAPFISSEEVWKSS 

APQLLE I LKVFQQI ENP I SKEG FLARAXLFLEERRF PHYVLRQMLEYRRQMFALPPDEAL 

SRGKDLRLFGYQTIQEWFGDAYLSAAVELLIRFIDEQKKVLPRPSKQEAJU3DFYDKAKHA 

YTKISKNKEFSLGFEEFWSYFQFLEISESEFFNMYRDILJ^KRALLXiiCGGVSFDFQPL 

TTFFVCGKDS IQ\^FFRLPKEYSFKTKQELKAFEVYLKLVSLPKSDSLDVPNEILP I ATI 

KAKEPRLVGRRFS IDYKRVALQDLAA WPMVEVLKWQQNS EHFQE I LQQFPDVETCQSYK 

DF0HLKPALRDKISLFTRKEILRARPERILQSLQQVPKQSQEVLJ1<SAGKNSAL.PGISDGQ 

QLAKVLX^EVLDLYSQDAETYYTIIVNSSFEKEEVLPYREVIJ^ 

ME RLE S ALRTRY PG EEGAS LWQ RPiWKVVENHRLGRH L EG S FSWS LDR S LKT F S RGDKE L 

PQEFDR I FSMKVGDYSSVFMSPNEGPCYYCCLSHLLYDRPASVDKLFLAKSQLDEELLGS 

YMERFIEQGWR 

CPn_0435 480908 479475 

Phospholipase D superfamily [uncleavable leader peptide] 
GVMMSRIJIFRIJkAIjSIFFIIXVPNSVSA 

ANFYVELCPCMTGGRTLKEMVDHLEARMDLVPELCSYI I IOPTFTDAEDQKLLKALKERH 
PN^FYVFTGCP?STSIIJ\PWIEMHIKLSIIDGKYCIIXX^NTEEFMCTPGDEW 
NPRLFVSGVRRP LAF RDQD I MLRSTA FGL Q LR EEYH KQF AMWDYYAHHMWF I DNP EQF AG 
AC PPLTLEQAEETVFPGFDKHEDLVLVDSSKI RIVLGGPHDKQPNPVTQEYLKLIQGARS 
SVKLiATOfX'FIPECDEIi.NALVDVSHNHGVHLSLITNGCHELSPAITG 
YGKRYPLWKKWFCEKLKPYERVS IYEFAIWETQLHKKCMI IDDEIFVIGSYNFGKKSDAF 
DYESrWIESPEVAAKANKVFNKDIGLSI PVSHGDI FSWYFHSVHHTLGHLQLTYMPA 

CPn_0436 481633 480902 

lplA-Lipoate Protein Ligase-Like Protein 

FYVCYMKVR TVDSGKS SAAS HMAKDRDLLESLQDGEL I LHLYEWEN PCSLTYGHFMRPEK 
FLLSNY ADLGLDAAVR PTGGGFVF HKGDY AF S VLMS AT H PSY S S SVLENYHTVNS FVAKV 
LEKVPRIQGMIJVPEDENSSSRDSGNFCMAKTSKYDVLFGDKK IGGAAQRKVOQGFLHQGS 
LFLSGSSSEFYQRFLKPEVLEE IIEQIQI HAFFPLGLEAADEVLQEARQQVKEAF I KLFC 
GEGL 

CPn_0437 481810 484350 

clpC-ClpC Protease 
VFMFEKFTNRAKQVIKIJ^EAQRIJ^ 

ARQEVEPXIGYGPEIQVYGDPALTGRVKKSFESANEEASLLEHNYVGTEHLLLGILHQSD 

SVAIJ3VLENL^IDPR£VRKEILRELETFNLQLPPSSSSSSSSSRSNPSSSKSPLGHSLGS 

DKNEPXSALKAYGYDLTEMVRESKLDPVIGRSSEVERilLILCRRRKNNPVLIGEAGVGK 

TAIVEGLAQKIILNEVPDALRKKRLITLDLALMIAGTKYRGQFEERIKAVMDEVRKH 

LLFIDELHTIVGAGAAEGAIDASNILKPALARGEIC^IGATTIDEYRKHIEKDAALERRF 

QKIWHPPSVDETIEIIJIGLiCKKYEEHHNWITEEALKAAATLSDQYVHGRFLPDKAIDL 

L^EAGARVRVNTMGQPTDLMKLEAEIEOTKLAKEQAIGTQEYEKAAGLRDEE^ 

SMKQEWENHKEEHQVPVDEEAVAQWSLQTG I PSARLTEAESEKLLXLEDTLRRKVIGQN 

DAVTSICRAIRRSRTGIKDPmPTGSFLFLGPTWGKSLI^QQIAIEMFGGEDAI.IQVDM 

S EYMEK FAATKMMGS P PGYVGH EEGGHLTEQVRRR PYCWLF DE I EKAHPDIMDLMLQ I L 

EQGRLTDSFGRKVDFRHAI I IMTSNLGADL I RKSG E IG FGLKSHMDYKV I QEKI EHAMKK 

HLKPEF INRLDESVI FRPLEKESLSE I IHLEINKLDSRLKNYQMALNI PDSVISFLVTKG 

HS PEMG ARPLRRVI EQYLEDPLAELLLKESCRQEARKLRATLVENRVAFEREEEEQEAAL 

PSPHLES 

CPn_0438 485455 484334 

ycbF-PP-loop superfamily ATPase 

NLTLPMP PQVR E IMQQTVI VAMSGGVDSS WAYLFKKFTNYKVIGLFMKNWEEDSEGGLC 
SSTKDYEDVERVCLQLDIPYYTVSFAKEYRERVFARFLKEYSLGYTPNPDILCNREIKFD 
LLQKKVOELGGDYLATGHYCRLOTELQETQLLRGCDPQKDQSYFLSGTPKSALHNVLFPL 
GEMNKTE-/RAIAAOAALPTAEKKDSTG ICF IGKRPFKEFLEKFLPNKTGNVI DWDTKEIV 
GQHQGAHYYT I GQRRGLDLGGS EKPC YWGKNI EENS IY I VP.GEDH PQLYLRELTARELN 
WFTPPKCGCHCSAKVRYRSPDEACTIDYSSGDEVKVRFSQPVKAVTPGQTIAFYQGDTCL 
GSGVIDVPMIPSEG 

CPn_0439 485523 486077 

No robust homolog present in Genebank/EMBL as of 11/7/98 
r ISSNMP'/LFVSSTLNGVFPSSLPEESADLFITNKEIVALGEKGNVFLTHSI PMHIAAIT 
I LV IVALAG IAI ICLGCYSOS r LLIAVG IVLT I LTLLCLQALVGFI KF I ROLPQQLHTTV 
OF IREK I RPESSLQLVTNAQRKTTQDTLKLYEELCDLSQKEFKLQSTLYQKRFELSHKNE 
KTNQN 

CPn_044Q ASkOHI 4 8^74 0 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LAT [PG*<?JMAT'1VAPSPVPE i j';pLf;HATEVLMLPMAY [TQPHP IPAAPWETFRSKLSTKH 
TUTALrLLLTUJGTr :AGY-V;YTr'.NWT [CGrGLGT rvr.Tr.ILALLLArPLKNKOTGTKL 
[DH [i>0M"S [G:'(jFV0RY(.7LMF'';T I K: "VHLl'ELTTONQEKTP TLNEI EAKKE3 TQNLEL 

KrTE(cr,KLAOKOPKRK:J:*OK';rMP::rKiit.::KNPvrLrrx; 

f i'n_0 4 4 I AH IH'.H 

CT007 hypothetical protein 

I'WK I LOI'MFKLLl'll [ AAl*A«.iUVL'7Pt' f (' [V(jI^Af Y ; tlJ[ , .EA ( '.j'NPPI , PPF:;AOVf jYLKVND 
AKFKKLr-H'JTIGYROYtt rTFLTTLiTTni: If Jt .[.FmvJY [GAD KjWKr/.lLl I.', [-TTDPNCIjG 
WATI^jLT-FYNYVLL^UlAYTLLLKNWf.JW:: I I U/ir.vrJi 'KN rEMGYGLYOGVL'JfJKYQAT 
KKI/^A 1 r GV T N ETC jLHQ R K AW [* f J 'A IV : ; Y K A' ITXj [ ,' r Lt J( ' I YF j VMF:j tDYR''TSV( 'NLGLAY 
R LT R F R K K r , H KNHL 1 3 I PG I F E Y V . H E I CANVK LTPWPG3F IKGFY r ;W3 [GNDISI ADDH 
NNNKT3HTFKT3AFFGG3AVMNF 

CPn_0442 1H79'>^ 483523 

CT006 hypothetical protein 

MI LFQTDMCFKN rCKQCJSQLYUKTFPERILARKLKNCAKSYPRTALTI EVLVSSVLGAL 
KVIL [ PCA3TYAALTLPLRALFNAIKTKSCQHLA3YAMAWLLH ILT I AV r IGLVFSLVF I 
PPPWFrSLGLLMSVTTS^LFOYHKNLFPPYEPPPSRPHTPPPFADEYVPLrSESYFD 

■■i:,..-*y5 1 / I,-- ■ 

CT005 hypothetical protein 

VDSMSOP PINPLGOPQVPAAAS PSGQ PSWKRLKTSSTGLFKRF IT I PDKYPKMRYVYDT 
GI IALAAIAILS ILLTASGNSU^YAIAPAI^LGALGVTLLISDILDSPKAKKIGEAITA 
I WP 1 1 VLAIAAGLI AGAFVASSGTMLVTANPMFVMGLITVGLYFMSLNKLTLDYFRREH 
LLRMEKKTQ ET AEP I L VT PSADDAKK I AVEKKKDLS ASARMEEHEASORQDARKRR rGRE 
ACjGSFFYSSRNPEHRRSFGSLSRFKTKPSDAASTRPASISPPFKDDFQPYHFKDLRSSSF 
GSGASSAFTPIMPASSRS PNFSTGTVLHPEPVYPKGGKEPS I PRVSSSSRRS PRDRQDKQ 
QOQQNQDEEQKOQ SKKK3GKSNQS LKT P P PDGKSTANLS PSN P FSDGYDEREKRKHRKNK 

CPn_0444 490266 494507 

pmp_6-Polymorphic Outer Membrane Protein 

KAF PQRHMKYSLPWLLTSSALVFSLH PLMAANTDLSSSDNYENGSSGSAAFTAKETSDAS 
GTTYTLTSDVS ITNVSAITPADK3CFTNTGGALSFVGADHSLVLQT IALTHDGAAINNTN 
*TALSFSGFSSIjLIDSAPATGTSGGKGAICVTOTEGGTATFTDNASVTI^KNTSEKIX3AAV 
SAYS I DLAKTTTAALLDQNTSTKNGG ALCSTANTTVQGNSGTVTFSSNTATDKGGG IYSK 
EK DSTL DANTGWT FK SOT AKTGG AWS S DDNI^TGNTQ VLFQENKTTG S AAQANNPEGC 
GGA ICC YLATATDKTG LA I SQNQEMS FTSNTTTANGGAI YATKCTLDGNTTLTFDQNTAT 
AGCGGAIYTETEDFSLKGSTGTVTFSTNTAKTGGALYSKGNSSLTGNTNIJjFSGNKATGP 
SNSSANQEGCGGAILAF IDSGSVSDKTGLS IANNQEVSLTSNAATVSGGAIYATKCTLTG 

ngsltfdgotagtsggaiytetedftltgstgtvtfstktaktg<salyskg^si^gi™ 
llfsgnkatgpsnssanqegcggai lsflesasvstkkglwi ednenvslsgntatvsgg 

AIYATKCALHGNTTLTFDGNTAETAGGAIYTETEDFTLTGST^^ 

KGOTSFTKNKALVTSGNSATATATTTTDQEGCGGAILCWISESDIATKSLTLTENESLSF 

I NNT AK R SGGG I Y APKCV ISGSESINFDG NT AET SGG AI Y SKKLS I T ANGPVS FTNNSGG 

KGGAIYIADSGELSLEAIDGDITFSGNRATEOTSTPNSIHIiGAGAKITKIAAAPGHTIYF 

YDPITMEAPASGGTIEEXVINPVVKAIVPPPQPKNGPIASVPV^ 

GKLPSQDASIPANTTTIL^QKINIJiiG^^NW 

TTT^n'rcSIDLKNLSVNLDALDGKRMITIAVNSTSGGLKISGDUCFHNNTC 
KANLNLPFLDLSSTSGTVNLDDFNPIPSSMAAP^ 

ALGYT PKPELRATLVPNSLWNAYVN I HS ZQQE I ATAMSDAPSHPG IWIGGIGNAFHQDKQ 
KE^GFRLISRGYIVGGSMTTPQEYTFAVAFSQLFGKSKDYWSDIKSQVYAGSLCAQSS 
Y^£L$LH SSL RRHVL S KVL P ELPG ET PLVL KGQVSYG RNHHNMTTKLANNTQG KSDWDS H S 
FA\?£VGGS LPVDLNYRYLT S YS PYVKLCWSVNQKG FQEVAADPRI FDASHLVNVS I PMG 
LTM^ESAKPPSALIXTLGYAVDAYRDHPHCLTSLm^SWSTFATNLSRQAFFAEASGH 
LK^HGLDCFASGSCELRSSSRSYNANCGTRYSF 

CPn_0445 494739 497579

pmp_7 -Polymorphic Outer Membrane Protein

Ft^VSKKCLQMKSSVSWLFFSS I PLFSSLS I VAAEVTLDSSNNSYDGSNGTTFTVFSTT 
DA|0SjTTYS LLS DVS FQNAGALG I PLASGCFLEAGGDLTFQGNQHALKFAFINAGS SAGT 
VAp^AADK^LFNDFSRLSIISCPSIJXSPTGOCALKSVGNLSLTGNSQI IFTQNFSSD 
NGGVJNTKNFLL SGTSQFASFSRKQAFTG KQGGWY ATGT IT I ENS PG I VSFSQNLAKG S 
GGi^YSTDNCS ITDNFQV I FTCNS AWEAAQAQGGAI CCTTTDKTVTLTCNKNLSFTNNTA 
LT^SGAI SGLKVS ISAGGPTLFQSNISGS SAGQGGGGAINIASAGELALSATSGDITFNN 
NQS^TNGSTSTRNAINI IOTAJ<VTSIRAATGQSIYFYDPITNPGTAASTE>TLNLM J ADANS 
EI EYGGAIVFSGEKLSPTEKAI AANVTST IRQPAVLARGDLVXJtDGVTVTFKDLTQSPGS 
RILMDGCOTLSAXEA^SLNGIAV^SSLDGTNKAAL SLSGTI ALIDTEGS 

FY^^LKSASTYPLJ^ELTTAGAI^TITLGALSTLTLQEPETHYGYOG^QLSWANATSS 
KISSiNWTRTGYIPSPERKSNLPLNSLWGNFIDIRSINQLIETKSSGEPFERELWLSGIA 
NF |3£RD SMPTRHGFRH I SGGYALG I T ATT PAEDQLT F AFCQLF ARDRNH ITGKKHGDTYG 
ASLYFKHTEGLFDI ANFLWGKATRAPWVLSEISQI I PLSFDAKFSYLHTDNHMKTYYTDN 
S I iESS3SWRNDAFCADLGASLPFVISVPYLLKEVEPFVKVQYI YAHQQDFYERHAEGRAFN 
KS EL INVE I P IGVTF ERDS KS EKGTYDLTLMY I L DAYRRNPKCQTS L I ASDANWMAYGTN 
LA^@G FSVRAANH FQVNPHME I FGQFAF EVRS S SRNYNTNLG SKFC F 

CPn_0446 497602 500415

pmp_8 -Polymorphic Outer Membrane Protein

LIEIEKHLSMKIPLHKLLISSTLVTPILLSIATYGADASLSPTDSFDGAGGSTFTPKSTAD 
ANGTNYVLSGNVY INDAGKGTALTGCCFT ETTGDLTFTGKGYSFSFNTVDAGSKAGAAAS 
TTADKALTFTGFSNLS F I AAPGTTVASGKSTLSSAGALNLTD^IGTILFSQ^A/SNEA^I^ING 
GAITTKTLS I SGNTSS ITFTSNSAKKLGGAI YSSAAAS I SGNTGQLVFMNNKGETGGGAL 
GFEASSSITQNSSLFFSGNTATDAAGKGGAIYCEKTGETPTLTISGNKSLTFAENSSVTQ 
GGAICAHGLDLSAAGPTLFSNNRCGNTAAGKGGAIAIADSGSLSLSANQGDITFLGNTLT 
STSAPTSTRNAIYLGSSAKITNLRAAQGQSIYFYDPIASNTTGASDVLTINQPDSNSPLD 
YSGT IVFSGEKLSADEAKAADNFTS I LKQ PLALASGTLALKGNVELDVNGFTOTEGSTLL 
MQPGTKLKADTEAISLTKLWDLSALEGNKSVS I ETAGANKT ITLTSPLVFQDSSGNFYE 
S HT INQAFTQPLWFTAATAASDI Y I DALLTS PVQT PEPHYGYOGHWEATWADTSTAKSG 
TMTWVTTGYNPNPERRAPWPDSLWASFTDIRTLQQ IMTSQANS I YQQRCLWASGTANFF 
HKDKSGTNQAFRHKSYGYIVGGSAEDFSENIFSVAFCQLFGKDKDLFIVENTSHNYLASL 
YLQHRAFLGGLPMPSFGS rTDMLKD I PLI LNAQL3 Y3 YTKNDMDTRYTSYPEAQGSWTNN 
SGALELaGGLALYLPKEAPFFOGYFPFLKFOAVYSRQQNFKESGAEARAFDDGDLVNCSI 
P VC I R L EK 1 3 EDEKNN F C I S LAY I G D VYR KN P R SRT 3 LMVSG AS WT S LC KNLARQAFLAS 
AG S H LT LS P HVELSGEAAY EL RG S AH I YNVDCG L RY 3 F 

CPn_0447 5005-4 1 503351

pmp_9 -Polymorphic Outer Membrane Protein

^KPPIALYMKSSLHWFLISSSI^LPLSLNFSAFAAWEIMLGreNSFSGPGTYTPPAQT 
TNADGT T YNLTGDV3 [TNAGSPTALTASCFKETTCNLSFQCHGYQFLLQNIDAGANCTFT 
NTAANKLL3 F3CF3YL: ' L I SJTTN ATTGTG A [ KSTGACS IQ3MY5C Y FCONFSNDNGGALQ 
G33 r 3L.3 LN PNLTFAKNKATQKGCALY3T( Xj IT rMiTLN3A3F3ENTAANNCGAIYTEAS 
'.'A' L33NK A I3F fNN3VT\T3ATGGA IY033T3APKPVLTL3DNOCLNF IGNTAITSGGAI 
YTDNL7L3: ;(X IPTI .FKNN3A [ DTAAPLGGA [ A [ AD3G3L3L3ALGGD [TFECNTWKGAS 

:;::(jrnTW-i:; [nigntnak r vo l rasognt r y r y d p i tt3 [taaljdalnlngpdlagnpa 

Y0 f V!\ r VFr;(;F.KI J Sl':Ar.A\KADNLK.lT[<^PLT[.AfXJ0LJLK3GV'rLVAK3F3Q3PG3TLL 
M I )U T IT f ,Kl 'A t X ; [ T I NNI ,V I ,NV P / LK ! 7 T K K /\ T I . K ATO A3 rj r/T L: \ \ : : I L VD P3GMVY ED 
7:;WNf If V/I-'lX'LTt.TAPDI'AN IH I rDlj\ADPLt^KMP i\\W, { ( J NWA L: "WQ ED" rATK 3KAA 
't'f /rWTKT j'YNPNI 'KPPi ;TLVANTLWC?JrVDVR3 [OOLVATKVRQ: 'OCTRG rWCEOTSNFF 
HKn.';TK[MKC[-KHl::Ai !YVVi'ATrr[.A - JDNLlT/\Ah' r OLr'GKDIiDHF [NKMRA3AYAASL 

^[. > oHLA , tu.:;:u'::[.^!^I'[Jx^;l::;□oPVLlTtAor';Y[Y^r^JTMK'!TY7VAPKGE:;:;wY^Ia;c 

Af AU .A3:;i ,1 ■tri'Al ;i ti:i *1 ,i 'I IAV FPF [ KVI^V :Y [ I \OU JFKFPf (TIT .VR:'FD3GDL TNV3VP 

K;i'rr-7-:i'K?:i^jK!vA:;vivv[VLYV\t)WRKNiMX"rrY'r,Lrrirrr::wKrrxrrNL3RQAGrGRA 



G I FYAFSPNLEVTTTILSMEI RGSSRrrVNADUVJKFCF 

CPn_0448 504376 5036'">8 

*yx:G_8s_2 Hypothetical Protein 

F IQPSRREI HEWKC ILLGS3LRMEMMS PFQQ P EQCH FDWG3 FLR P ESLTRARSDFEEGR 
IVYEC^RVVEDAAIPJiLIKKGTEAGLIFFTDGEFRRYSVroFDF>fUGFHGVDR 
TGVYLKDK I SVSKKPF I EH FEFVKTFEKGNAKAKCT I PS PSQ FFHEM I FAPNLKNTRKFY 
PTNQEL I DO IVFYYPQV I Q C LY AAGC RNLQ LD DC AWC R L LD [ R A PS WY GVDS H DRLQE I L 

; ;", • :;;a : ~ * ; " - '[>,;.' : Nil ' ' \ \ : ,\ ' a ' '■' .rvL^l.r-r'Z 

CG F ASC EGDH RMT EEEQ W K K I AFVK E I AK L IlvG 

CPn_0449 507231 505330 

pmp_10-PMP_10 (Frame-shift with 0451)
E1AYTGFRGC<X3ISF3^IVQGTTAGNGGAISIL^ 

TTKRNS I DIGSTAKITNLRAISGHS I FFYDP I TANT AADSTDTLNLNKADAG NSTDY SGS 
I VF SG EKLS EDEAK7ADNLT S T LKQ PVTLT AG NLVLK RG VT L DTKG FTQT AG S S\ r I MDAG 
TTLKASTEEVTLTGLS I PVDS LG EG KKW I AA S AAS KNVAL SG P I L LLDNQGNAY ENH DL 
GKTQDFSFVQLSAU7TATTTDVPAVPTVAT 

WTNTGYLPNPE^OGPLVPNSLVTCSFSDIQAIC^yiERSAI.TIXSDRGFWAAGVANFLDKD 
KKGEKRKYRHKSGC^A IGGAAQTC SENLISFAFCQLFGS DKDF LVAKNHTDTY AGAFY I Q 
HITECSGFIGCLIXKLPGSWSHKPLVLEGQIAYSHVS^LKTKYTAYPEVXGSWGNN^ 
MMLGAS SHSYPEYLHCFDTYAP Y I KLNLTY I RQ DS FS EKGTEGRS FDDSNLFNLSLP IGV 
KFEKFSDCNDFSYDLTLSYVPDLIPJ^DPKCTTALVISGASWETYANNLARQALOVRAG 
YAFS PMFEVLGQP/rEVRGS SRI YNVDLGGKFQF 

CPn_0450 503121 507180 

pmp_10 -Polymorphic Outer Membrane Protein

SGFMKSQFSWLVL^STLACFTSCSTVFAATA£NIGPSDSFDGST^^ 

TLTGDITI^NLGDSAALTKGCFSDTTESLSFAGKGYSLSFLNIKSSAEGAALSVTTDKNL 

SLTGFS SLTFLAAPSSVITT PSGKGAVKCGGDLTFDNNGT I LFKQD YC EENGGAI STKNL 

SLKNSTGSISFEGNKSSATGKKGGAICATGTVDITNOTAPTLFSNKIAEAAG^ 

CT I TGNTSLVFS ENSVTAT AGNGGALSGDADVT I SGNQSVTF SGNQAVANGGA I YAKKLT 

LASGGGGVS PFLT I 

CPn_0451 508153 511058

pmp_10-PMP_10 (Frame-shift with 0451)

KTQRVK I KT LDSC FV I FNL I YLFCFY I DANS S LKNKSITMKTS I PWVLVSSVLAFSCHLQ 
SLANEEIiSPDDSFNG^IDSGTFTPKTSATTYSLTGDVFFYEPGKGTPLSDSCFKC/TTDN 
LTFIX!bK3HSLTFGFIDAGTHAGAAASTTANKNLTFSGFSI^SF^ 

AGGVNLENI RKLVVAGNFSTADGGAIKGAS FLLTGTSGDALFSNNS SSTKGG AI ATTAGA 
RIANNTGYVRFLSNIASTSGGAIDDEGTS IL3NNKFLYFEGNAAKTTGGAICNTKASGSP 
ELIISNNKTLIFASWAETSGGAIHAKKLALSSGGFTEFLRNWSSATPKGGAISIDASG 
ELSLSAETGNITFVRNTLTTTGSTDTPKRKAINIGSbKIKFTELR^ 

GTS S DVLK I NNGS AGALN PYQGT I LFSGETLTADELKVADNLKSS FTQ PVSLSGGKLLLQ 

KGVTLESTSFSQEAGSLLGMDSGTTLSTTAGSITITNIXSINVDSLGI^QPV 

KVrVSGKI^IDIBGNIYESHMFSHDQLFSLLKITVDADVIDTNVDISSL 

YGFQGQWNVMjTrrDTATNTKZATATWTKTGFVPSPERKSALVC^^ 

EIGATGMEHKQGFWVSSMTNFLHKTGDENRKGFRHTSGGYVIGGSAHTPKDDLFTFAFCH 

LFARDKlXrFIAHNNSRTYGGTLFFKHSHTI^PQWLRI^RAKFSESAIEKFPREI 

VQVSFSHSDhrRMETHYTSLPESEGSWSNECIAGGIGI^LPFVT.SNPHPLFKTFIPQMKVE 

MVWSQNSFFESSSDGRGFSIGRLIj^SIPVGAKFVO^DIGDSYTYDLSGFFVSEA^ 

PCSTATLVMS?DSV^IRGG^SRQAFLIJ^GSN>rYVYNSNCELFGHYAMEL^^ 

VGTKLRF 

CPn_0452 511304 512860 

pmp_12 -Polymorphic Outer Membrane Protein (truncated)
FNEETMTILRNFLTCSALFLALPAAAQVVYL^ESDGYNGAINNKSLEPKITCYPEGTSYI 
FLDDVR I SNVKH DQED AGVF I NRSGNL FFMGNRCNFT F HNLMT EG FG AA I SNR VGDTTLT 
LSNFSYLAFTSAPLLPOG^GAIYSLGSVMIENSEEVTFCGOTSSWSGAAIYTPYLljGSKA 
SRPSVNLSG^YLVTRJDNVSCGYGGAISTHNLTLTTRGPSCFENblHAYHDVNSNGGAIAI 
APGGS I S I SVKSGDLI FKGNTASQDGNTI HNS I HLQSGAQFKNLRAVSESGVYFYDPISH 
SESHK ITDLVINAFEGKETYEGTI SFSGLCLDDHEVCAENLTSTILODVTLAGGTLSLSD 
GVTLQLHSFKQEASSTLTMSPGTTLLCSGDARVQNLH IL I EDTDNFVPVR I RA EDK DAL V 
SLEKLKVAFEAYWSWDFPQFKEAFTIPLLELIJ3PSFDSLLLGETTLERTOVTTENDAVR 
GFWSLSWEEYPPSLDKDRRITPTKKTVFLTWNPEITSTP 

CPn_0453 513156 516152 

pmp_13 -Polymorphic Outer Membrane Protein 

NCVLLYLFFYSLSLICRIIWFHLYVQMKTSIRKFLISTTLAPCFASTAFTVEVIMPSENF 
DGSSGK I FPYTTLSDPRGTLC I FSGDLYI ANLDNAI SRTSSSCFSNRAGALQ ILGKGGVF 
S F LN I RS SADGAAI S SVI TQNP ELC P LS FSGFSQMI FDNC ES LTSDTSASNV I P HASAI Y 
ATTPML FTNNDS I L FQYNRS AG FGAA I RGTS I T I ENTKK S LLFNGNGS I SNGGALTGSAA 
INL INNSAPVI FSTNATG IYGGAIYLTGGSMLTSGNLSGVLFVNNS SRSGGA I YANGNVT 
FSNNS D LTFONNTAS PQNS L PA PT P P PT P PAVT PLLG YGGA I FCT P P AT P P PTGVS LT I S 
GENSVTFLENIASEOGGALYGKKrSIDSNKSTIFLGNTAGKGGAIAIPESGELSLSANQG 
DILFNKNLS ITSGTFTRNSIHFGKDAKFATLGATQGYTLYFYDPITSDDLSAASAAATW 
VNPKASADGAYSGTIVFSGETLTATEAATPANATSTLNOKLELEGGTLALRNGATLNVHN 
FTQDEKSWIMDAGTTLATT^ANNTDGAITIJ^KLVINLDSLEXJTKAAV^V^ 
ISGTD^LVKNSQIXCONHGMFNKDIJSQVPILELKATSNT^/TTTDFSLGTNGYQQSPYGYQ 
GTWEFT I DTTTHT\TGNWKKTGYLPH PERLAPL I PNSLWANVI DLRAVSQASAADG EDVP 
GKQLSITGITNFFFANHTGDARSYRHMGGGYLINTYTRITPDAALSLGFGOLFTKSKDYL 
VGHGHSNVYFATVTSNITKSLFGSSRFFSGGTSRVTYSRSNEKVKTSYTKLPKGRCSWSN 
NCWLGELEGNLP I TLS SR I LNLKQ 1 1 PFVKAEVAYATHGG IQENTPEGR I FGHGHLLNVA 
VPVGVP FGKN3HNRPDFYT I IVAYAPDVYRHNPDCDTTLPINGATWTS TGNNLTRSTLLV 
QA33HT3VNDVLEI FGHCGCDI RRTSRQYTLD IGSKLRF 

CPn_0454 516179 519115

pmp_14 -Polymorphic Outer Membrane Protein

GM P L3 FK3 3 . f J FC L'_\C LC 3 ASC AFACTR LOG 1 1 P/ P P I TNOG £ E I L LT3 D FVC 1 ' N F LGA3 F 
333F [ M 3 3 3 N L3 LLC K' J LS LT FT 3 CQ A PT NS f IY ALLS AA ET LT FKN F3 3 INFTCN03TGL 
(3GL rYGKD EVFQ3 :<DLrFTTNRVAY3PA3^yTT3ATPAXTTVTTGA3AL0r > TD3LTVENI 
303^KFFGN[^FO^^I3J3PTAWK^rN^rrATM3FSH^/FTSSC/y^V[YtX7;3LLFE^I^f3 

or i rrTAN3cw3; sg^t r 3 3c ;t y a u. i/jgg a i c i ptgt f el knnoc kc p n ) y Nt ;t p m dag 

A I YAET^N I VON(.V -\LLLD3NTAAPNTX lA rCAKVLNtOCPCP I EFSRNRAKW XI A IF [GP 

■ ; vo d pa KO r r rp lt : : A3 ?r, r i \ fognmlnt y pg r rna i t veagc : e r V3 L3 Ay k - 3. k r ;/ f y 

DP rTH3I J ETT3I'SNKO IT [N W .A: !03WFT3KGLSSTELLLPANTTT I Lli. ITVK [ A3G F. 
LK ITUlAWirJLGF \ T<j: Wi^l.TU !3f V rrL/JLATPTGAPAAVDFT L;KL-\rDf 'P*'FLKPO 
KV'JAV/NAl ;TKNVT:/rt;AlA'LUhHI;V , Pn[/^rjMV3L03PVArPIAVFK^A-!'V , i , K'[\;F['r^t: 
rATP3HYf;YC/;KW-YTW3Rri,L,IPAPUX;rH;GP3P3AflTLYAVV^J3[i , rLVR3'PYr[.tjpE 

r y ; e r v 3N3 r ,w r 3 r l, ; f jo a i *. : t j i uy >\t i , 1 , 1 1 a\ \ <; L3 [ t a kalg a yv kht p i >w ; 1 1 1 \ ; f -;g h 
YGt WAAL: IMrrfTDHTTL^L^FGQLYGKTNANPYDSRCSEQMYLLSFFGQFP IVTQKSEA
L I l^K/VVYr^YGKNHUm^i'LPPDKAPKSOOOWHNNSYTVLISAEHPFLWVCLLTRPLAQA 
WDL3GF [SAEKLOCWQSKFTETGDU3R5F5RGKGYNVSLP ICCSSQWFTPFKKAPSTLTI 
KLA YKPD I YR VNPHN I VTWSNQESTS TSGANLRRHGLFVQ I KDWDLTEDTQAFLNYTF 
DGKNGFTNHRV'JTGLKSTF 

CPn_0455 520363 519458

No robust homolog present in Genebank/EMBL as of 11/7/98

v ;■■:•■[- w;:.pf;.: ,' * 'v r rrzYT'-r "ii f rrr; r v. tcl : czr/T-EGYrrfLNVV 

!•■■! :.:Mr:r,V/i i.u.rr.Ln'^ .m'-vt .M.i-rr.iirniRNr or /^F^rrv pekeaadv: 

AS GC K WAF DDEHLPWV3 3 H I AYAEE I R EKQEQTMQGS LT EEQLGALLCNTV STEKNLAF 
ALDA V I KQS VWR FRNP DLFAYEREAL EAS VTDALVS YVSNLDM I PYTS SQG I VI EDSS I V 
RTSQEHTL I VNCAAFDKLASQI EFLC PSDVLPI SGKDPLI SDDEDEELNPKVSSAADSKD 
KT 

CPn_0456 521568 520327 

No robust homolog present in Genebank/EMBL as of 11/7/98 

I PCTFESKRKFLMTHCLHC^SVVRHHFVQAFNFSRPLYSRITHFAJjGVIKAI p IVGHLV 

MGVEWLISHCFERGVSHPGFPSDIAPIIJ<VEKIAGRDHISRIENQLK5LRKTIEVEI)LX1K 

VHGQYQ ENPY ADMASS EVLKLDKGVHVSELGKAFS RVRNRITRSYS YAPTPQLDS I AIVG 

IDLVS PEEQENLVRLANEVIQLYPKSKTTLYLLI DFNKEWVGDI SSDKEKQLRSLGLHSE 

VQCLSVLEPQGAEGEDTKHFDLMVCX: YGKDSYLREGK I LQQAL^^ 

SRYRSRLSLP INT EKDKTELYKE I SRTHHQLH TLGMGLGAQDSGLLLDRQRLHAPLSQGS 
HCHSYLADLTHEELKILLFSAFVDAKNISKKELREVSLNFANDTSVECGCAFYF 

CPn_0457 523886 522120 

No robust homolog present in Genebank/EMBL as of 11/7/98
VPLPSRVMASCLSAWFS IVREHFYRAFDFSLPFCARITEFVLGVIKG I PVVGHI IVGIEW 
LVS RY LES FVTKPTFVS DWSLLKTEKVAGRDH I ARWETLKRQRVAVAPEDEDKVHGKI 
P VH PFGG IQ PVEVLTLYPEVQDATLGLAF SK I RNRVRQAYLQAFRPKLQK IY I IGNDMNP 
FEVDDFLHLARLO^QRLYPDATISLYLTASGGRNAMDKKNRKIXSrc 
NC^DVVKQATCDCWMVYHGENDQGTLNQ I QEELEKSGEETPWIHVGQKPLSQSLWDFS PF 
SSLE^GDKEKALEYSELEKEQLYSRLVYVGERSSVLSLGrcDSRSGILKDPKRVHAPLS 
EGHYCHSYLADLENPGI^KTIIWUFLJfPKELSST^ 

MSRSDRNVVVWCDSWWGTDWKEEPSFQHFIMELECRGYSHFNIFAFRSNSMO^ERRIL 

NESSQEKAFTMIFCEDSVSQGDIRCLHIiASEGMLCGKECYAVDVYTSGCANFMM 

ER ESNLWNRKHGLWKREVRKQKQ EAALDQDE SE I YVCNQLTAQQNFAC S 

CPn_0458 526344 524236 

No robust homolog present in Genebank/EMBL as of 11/7/98
YF &GCYLKLFVSNF I F FWMP I PY I S SWI STVRQHFVKAFDF SRPFCSRVTNFALGVIKA 
I £*VGH IVMGMEWLVSSCVAG IITRSSFTSDWQIVKTEKALGRDHISRVAEILQRERGT 
I T £>ENQ DKVHGK F P VC PFGRLK S EET LKLK PG EREGTLDTVF S P I RT RVTRAYLQ APRP E 
I Kfl S I VG S KLKT PQD F SQ FVS LANETORLH P EALVC LYLTG LNRESQMC DTTTAEXKQY 
LHSSG LDSRIQCKDSK EDDAG S PENP ELW I GYY SREQQHN I DGQY I QQC LGKS AD P I PW I 
HVTEDTKDFTYPPNFTSYSHTRQSTDPTSPPRLPESEGDKDSLYGQLSRSY^HEYMLGLG 
LK RED AG LLMD PDR I YAP L SQG HYCH SYLAD I ENEDLRT LVLS P FL DPGNLS S EDLRPVA 
FMIARLPLEIXiSLFFRLVAGQQEGRNIVTLAHGTPRPEDLDPDSMNILTRRL 
N^.SYKSRKMIVKERQFFGDRSEGKSFTLILFEDPISAADFR^ 

D^CASGCSCIQFSEMQSPQAIEYRCWEARVEDEAGEEAREPVIYSQDQLSSMLTTCX^^ 

FSLTDAWKQAIWRFRSKGLLTMERKALGEEFLTAIFS 

SEEELDRMVQVLPAEVPADSGNDPTRPVPNPDSNPDSSQNEGS 

CPn_0459 527062 526619

No robust homolog present in Genebank/EMBL as of 11/7/98
STJKIQMH PGLRNWRTSTNKLREEGSVS FREYF RAYMCDK I VAQKNF LFTLDAVIKQAGWR 
SQEKLNLFYVESQALGREI KVSLEEY IQSMVG ILGSQRTKKSFKFSVDFTPLEQALQERC 
SSjS&DEDATATSTATGATASPTDMHEDE 

CPn_0460 527840 526992

No robust homolog present in Genebank/EMBL as of 11/7/98
Vljj^LNFALEETPSISVQYQEQEKLSPCDHSPEIGKXK^ 

Y^LNI^IQNSLSGWI^DPYRVCAPLSSPYSCPSYLLDLQNKELRRSLLSTFLDPKNLTSE 
TFRSVS INFGNSSFGQRWSEFLSRVLHDEKEKHVAWCNDAKLLEEGLS PEALSLLEEDL 
RE,S©YSYLNILSVSPEGVSKVQERQILRRDI^GRSFTVMITDLPLGSEDIRSLQLASDRI 
LVS^SLDAADACASGCKVLVYENPNASWAQELENFYKQVERRR 

CPn_0461 528647 527844

No robust homolog present in Genebank/EMBL as of 11/7/98
IS IVAC PS I SSWFTWRQHFVNAFDFTHPVCSRITNFALGI IKAIPVLGHIVMGIEWLI S 
WI PRHTVRHGMFTSDVSSAIKVECTRGHNCLAPLEAYLaSLRVPISQEDLGKVHGRTPED 
PFVDITPTEIVQLLPDEELSTVDEALCGVRSRLTYAYRSVEKPMIQDLALVGFGLRDSAD 
L I NFVR LANGVQNHYPHTKVKLYLAKNIADVWDCE I S EEEKGQLRALGLD PKIESISLTS 
AG LP SV PEV ATVDFM I TCYGKDQ EVQDP 

CPn_0462 531124 529037 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LI FYLFLNLYIACVRFHFQCWFDPMACYIS IWISTVKQHFIRAFDFTRPLGSRITNFALG 
VI KAIPILGCWICVSWLVSTCSARRFGKPAFTSDVAS IVKIEKTRGYNPLAWVEQYLRQ 
LRVRLPEGDLGKI HGKVSRDYVCDRTPQENLNMVPHQYLGELGRAFYG I RNRVTKAYQRV 
T P LEVPC LT LVG F D I LDP EDQVNFVR LANG I QTQ Y PQTOI KL YL I S IQK I WNQC DGT I SQ 
E K EQQLRS LG LDAK I KCVS A PALLLQ KYLQ S ENL PSCDLL INYYGKQQSVRDVDS I KSL L 
N L33 EH I PA I SVTY RPDDPFYSYYFFPGS QGGT APDQR I PWS EQ EH LQTYTT LSNPRCDR 
YAVHLGMEDFASGVFLDPLRVSAPLSGEYSCPSYLLDLKSEELRCFLLSAFIDPNNSGQG 
NPRPMSINFGNSPLGQRWSEFLSRVLHDETEKHVAWCNNPQLIKKSFPSHSLSLLENEL 
EEGGYSYLNIVSV3QERTCVKERRILSSDPSGRSFTVILTDLPEGSSDIRNLQLASDRIL 
VSSALDAADACASECK I LEYEDPEQEWAQQYASFYPNIDRAGDLQRQG I PGEPLGVSAST 
RWLEKDrVFNLNAVTQQAMWKFKKRDLFAVESQALGDDMRRALEGYIGSSLLVEGTIQP 
OVACNVNVnFATLDEAVCAACDSAQDAPSEENNTDD 

■':['[._(Mb3 '> 12491) 53U<>1 

No robust homolog present in Genebank/EMBL as of 11/7/98

f pyekteollcitpncrtprvn istvg i pidetcnafvdsmmkqgvoqdakelytflsr 

':NKMYOEi:LWr-V;iJirXljJFLFDCKMLCAPLSEDHYCMSYLVDLVDQHLKDLILSMFLDPQ 
HI ::A/Jfr:M,KV:: I NVf;rj;:r;'PLuQKDFL;:MVLRDETCJKNWWFKc:VLSLPATQVCKLVEE 
f .M:;KDY:;Y(,N[ F::CHfiDSS PQLLFRKEr<ECT3GRYFTV[CALY[^^DTDMRoLQ LASER IM 
V.' ; I i f ■: F D [ . V I )A Y AA Ft ( ; K [ . L K E DHTNW R PG T FS R H AD F A DA VDV 'J AC F NS R E F K L I TQANQG 
i LR::t Jtlt.t J LP:jKT[-'WROFt J AFCDRVTVTRHF I PMLDAA I KQAW/THKHF3L I DKECEALD 
l.riV.'I.I'.'MVSYl.lCVVTNSMEKT^KGPFtOKEI I ADOOPLKEALFPGSDEDVPSTSEDPS 

ijiJiir-niJi-i-D:: 



CPn_fMS4 5J2123 S \2 

No robust homolog present in Genebank/EMBL as of 11/7/98
3LETRGRFTEECLQLLFFDIQ5LKFLQLF3EGTALNLFR [FAPLRNRVTTEYSRARQPDL 
HRIAIVYIGVLDSESSKELEFLISYMSC rYoESQI^LRFF^KNVNCSAVLSKLHVENLH 
IRCGFF3EDAVPE3EPFDL3IYVHTDRSCPLPTKKR3SSWELCTVELPE5 rYPQSEFLLM 
RPRMLS 

CPn_0465 53 327^ 1 

. i - i t r-,f j ■- :^ '•. t 'w i -'■ :i'7 i« 

ALi-t II(JvL/VOLl^LwK Ir'Kt'Lri'JDVU^iVkVLtf; vwttLMKjRVEDILKRQRLoLEt 
RDEGKVHGDLPSAPFF 

CPn_0466 533713 536537 

pmp_15 -Polymorphic Outer Membrane Protein

TS^FFCFGMLLPFTFVLANEGLQLPLCTYITLSPEYQAAPQV^ 

DFILDYKYYRSNGGALTCKNLL ISENIGNVFFEKNVC PNSGGAIYAAQNCT I SKNQNYAF 
TTNLVSDNPTATAGSLLGGALFAINC S ITNNLGQGT FVDNLALNKGGALYTETNLS IKDN 
KGPIIIKQNRALNSDSLGGGIYSGNSLNIEGNSGAIQITSNSSGSGGGIFSTQTLTISSN 
KKLrEISENSAFAN^^I r GSNFNPGGGGLTTTFCTII^^lEGVLFNNNQSQS^JGGAIHAKSI 
1 1 KENGPVY FLNNTATRGGALLNLSAGSGNGS F ILSADNGD 1 1 FNNNTASKHALNPPYRN 
AIHSTPNMNLQIGARPGYRVLFYDPIEHELPSSFPILFNFETGHTGTVX.FSGEHVHQNFT 
DEMNFFSYLRNTSELR.CGVI^VEDGAGLACYKFFQRGGTLLI/X^ 

PTTVGSTITLNHIAIDLPSILSFQAQAPKIWIYPTKTGSTYTEDSNPTITISGTLTLRNS 
NNEDPYDSLDLSHSLEKVPLLY I VDVAAQK INSSQLDLSTLNSGEHYGYQG IWSTYWVET 
TT I TN PTSL LG ANTKH KLL YANWS PLG YR P H P ERRG EF ITNALWQSAYTALAGLHSLSSW 
DEEKGHAASLQGIGLLVHQKDKhKSFKGFRSHMTGYSATTEATSSQSPNFSLGFAQFFSKA 
KEHESQNSTSSHHYFSGMCrEOTLFKEWIRLSVSIAYMFTSEHTHTMYQGLLEGNSQGSF 
HNHTIAGALSCVFLPQPHGESLQIYPFITALAIRGNLAAFQESGDHAREFSLHRPLTDVS 
LPVGIRASWKNHHRVPLVWLTEISYRSTLYRQDPELHSKLLI SQGTWTTQAT PVTYNALG 
IKVKNTMQVFPKVTLSLDYSADISSSTLSHYLNVASRMRF 

CPn_0467 536528 539434 

pmp_16 -Polymorphic Outer Membrane Protein 

NET LT I SDQNRKIKEPLVS KT P PKFLFYLGNFTACMFGMT P AVYSLQTDSLEKFALERDE 

EFRTSFPIiDSLSTLTGFSPIT^FVGNRHNSSQDIVLSNYKSIDNIIJLX*WTSAGGAVSCN 

NFIXSNVFXiHAFFSKNTAIGTGGAIACQGACTITKNRGPLIFFSNRGLNN^ 

AIACNGDFT I SQNCGTFYFVNNSVN7^>^ALST>*3HCR I QSNRAPL^FFNOT 

RSENTTISD^^^RPIYFKNNCGN^K3GAIQ/^SVWAIKNNSGSVIFNNNT 

GGAIYTTNLS IDDNPGT ILFNNNYC IRDGGAICTQFLTI KNSGHVYFTNNCJGNWGGALML 

LQDSTCIXFAEC<^IAFQNNEVFLTTFGRYNAIHCTPNSNI^IjGANKGYT^ 

HPTTNPLIFNPNA^QGTILFSSAYIPEASDYENNFISSSKNTSELRNGVI^IEDRAGWQ 

FYKFTQKGG I LKLGHAAS I ATTANSET PSTSVGSQV 1 1 NNLA INLP S I LAKGKAPTLW I R 

PI^SSAPFTEDNNPTITLSGPLTLLNEEIJRDPYDSIDLSEPLQNIHLLSLSDVTARHINT 

DNF H P ESLNAT EHYGY QG I WS P YVA/ET ITTTNNAS I ETANTLYRALYANWTPLGYKVNPE 

YCGDI^TTPLWQSFHTMFSI^SY^TGDSDIERPFLEICGIADGLFVHQNSIPGAPGFR 

IQSTGYSLQASSETSLHQK ISLGFAQ FFTRTKE IGSSNNVSAHNTVSSLYVELPWFQEAF 

ATSTVIAYGYGDHHI^SLHPSHQEQAEGTCYSHTIJLAAIGCSFPV^KSYIJiLSPFVQA^ 

AI RSHQTAFEEIGDNPRKFVSQKPFYNLTLPLG IQGKWQSKFHVPTEWTLELSYQPVLYQ 

QNPQ IGVTL LASGGSWD I LGHNYVRNALG YKVHNQTALFRSLDLFLDYQGSVSSSTSTHH 

LQAGSTLKF 

CPn_0468 539608 540432 

pmp_17 -Polymorphic Outer Membrane Protein

IYKLLDNKLMI FYDKLYFH IKVWMFMRP I CLS I LSTALCC SLSGNEVPNLASCQMS RKD I 
SAFHTSPSFRIJ^PEPLVSSFRPSNLLNGFGHDITQDITITGNSINSVrDYNYHyEDGG 
I LAC KNLF I S ENKGNL SFERNSSHS SGGALY S VR EC W I S KNQNY S F I SN AAS LATTTTSG 
FGGAIHALDSY ITNNLGEGQFLDNVSKNRGGAIYVGVSLS ITDNLGP I VIKKNQTLEDSS 
FGGGIFCRAVNIERNYQNIQ INDNASGQGWYFLP 

CPn_0469 540399 541460 

pmp_17 -Polymorphic Outer Membrane Protein (Frame-shift with 0469)

CFRTRGGIFSALGVI ISSNKEI IEISNHSASSI^r^ASGKLYPGGGGIMCTSLVIE^lNPKG 
L I FNNKTAALSGGAIHTRS F IFQNNGPTAF INNSATSGGALINLSG IGSTPQNFFLSADY 
GDI LFNNNT ITS S S PQ PGYRNALYAA PG I NLKLGARQGYK I LFYDP IDHDQTTTDP I VFN 
Y E PHHLGTVLFS G INVDSNATNP LNF LSKFSNS SRLERG VLA I EDRAA I SC KT LSQTGG I 
LRLGNAALIRTKGPGSSINFNAIAINLPSILQSEASAPKFWIYPTLTGSTYSEDTSSTIT 
LSGPLTFLNDENENPYDSLDLSEPRKDIPPPLPPRCDCKKNRYFESHCRSHELR 

CPn_0470 541357 542532 

pmp_17 -Polymorphic Outer Membrane Protein (Frame-shift with 0470)

I S LNLER I S PLLYLLDVTAKK I DTSNL I VEAMNLDEHYGYQG IWS PYWMETTTTTSSTVP 
EQTNTNHRQLYVDWTPVGYRPNPERHGEF IANTLWQSAYNALLG I R ILPPQNLKEHDLEA 
SLCX3LGLLINQHNREGRKGFRNHTTGYAATTSAKTAARHSFSLGFAQMFSKTRERQSPST 
TSSHNYFAGLRFD5LLFRDFISTGLSLGYSYGDHHMLCHYTEILKGSSKAFFNNHTLVAS 
LDCTFLPARITRTLELQPFISAIALRCSQASFQETGDHIRKFHPKHPLTDLSSPIGFRSE 
WXTSHHIPMLWTTEISWPTLYRKNPEMFTTLLISNGTWTTQATPVSYNSVAAKIKNTSQ 
LFSRVTLSLDYSACVSSSTVGQYLKAESHCTF 

CPn_0471 542561 545401 

pmp_18 -Polymorphic Outer Membrane Protein

TVQNNRSLS KS S FFVGAL I LGKTT I LLNATPLSDYFDNQANQLTTL F PL I DTLTNMTPYS 
HRATLFGVRDDTNQDI VLDHONS I ESWFENFSQDGGALSCKSLAITNTKNQI LFLNSFAI 
KRAGAMYVNGNFDLSENHGS I IFSGNLSFPNASNFADTCTGGAVLCSKNVTISKNQGTAY 
F I NNKAKSSGGA I CAA 1 1 N I KDNTGPCLF FNNAAGGTAGGALFANACR I ENNSQP I YFLN 
NQ SGLGGA I R VHQEO I LTKNTG ^ V I FNNN FAM EAD I S AN H S SGG A I YC I S CS I KDN PG I A 
AFDNNTAARDGGAICTOfiLT IQDfJGPVYFTNNQGTWGGA rMLRQDT,ACTLFADQGDI I FY 
NNRH FKDTF.^MH V^VNCTRNVj; LTVGA.^QG] \V> ATFYDP I LQRYT tQNS IQKFNPNPEHLG 
T I LFSSTY I PDT'-T^RDDF [flHFRNH [GLYNGTLALEDRAEWKVYKFDQFGGTLRLGSRA 
VFTTTDnEQ: M *\\ : J V [ N [ NN LA INLF'I T U'-NRVAPKLW I R PPOttJAP Y3EDNNP I ENL 
•Icr'L'lLI.DDENr.DrYDTADLAof' rAEVPLLYLLDVTAKII [NTDNFYPr^'LNTTQIiYGYQG 
VW'^'YWrFTnT^PT ; ;EDT\-NTLHROLYf ;dWTPTGYKVNPENKGD rAL:'AFViQ:JFHNLF 

a* r lr yq' r{x>i ',q r a i t a * v \ v.\v\< i , f vh fjw; nn [ ja kg Fi t m eatg y s l* iTt: j nta; ;nh ; * fgvn 
F f ;o^^:;N[ J Y^::Ml^pNJyAJ;!^r^™Lor^JNl^LQERF^'T^;ASL^Y::Y:':^JilHrKA:x^Y [ :GK 

[0'I , nGK( , YST'l , Li*.\AL:7*:J[ J :"[ J OWP:'pf '[,| ir*rr'F rOA TAVR.'-JNUTAFOE^CDKARKFwVH 

K[-L/NLTVPUjiL v uwF.r;Kni[ jTYwri£i:i^\YypVLYL/jNPEiNV:ux.:*'xr::^ 
IjAI'NA I Ai'KiIRNQ T T [ TPKI .:VF[,LjY'> ;:;v:;:J':'rrrMYLiiAG't'TFKF 

iT'ti_iJ4/^ M )M 'i4'>'jHi 
No robust homolog present in Genebank/EMBL as of 11/7/98

evfma:;g rr jg.'^jglgk i ppkdtjcdrsrspspkgelgsheislppqehceegasgsshihs 

3S3FLPEDQE:;QSSSSAASSPGFFSRVRSGVDPALKSFGNFFSAEST3QARETRQAFVRL 
3KTITADERRDVDSSSAAATEARVAEDASVSGENPS0GVPETSSGPEPQRLFSLPSVKKQ 
oGLGRLVQTVRDRIVLPSGAPPTDSEPLSLV'ELNfLRLSSLROELSDIQSNDQLTPEEKAE 
ATVTIQQL IQ ITEFQCGYMEAT0SSVSLAEARFKGVETSDEINSLC3ELTDPELQELMSD 
GDSLQNLLDETADDLEAALSHTRLSFSLDDNPTPIDNNPTLISQEEPIYEEIGGAADPQR 
TRENWSTRLWNQ TREALVSLLGMI LS ILGS I LHRLR I ARHAAAEAVGRCCTCRGEECTSS 
■■'.I'ti'.'M:'.':- '.:'] • "{",'■ ' l "VVT'T ' "PITVPiT-T! * " r " ' .M* < A I ' ! f IW/- r ' "^ r'TT ' ' T". 

■n-F.i;-' : \i ;w .m).;.;'.-" f/MnDU'T-',/^ • n»vr , .r'pw*p\p~[.-.CDV 

KGDYiiVPITJAEPSKDKNIYMTPRLATPAIYDLPoRPGSCGSSRoPSjDRVRSSSPNRRG 
VPLP PVPSPAMSEEGS I YEDMSGASGAGESDYEDMSRSPSPRGDLDEP I YANTPEDNPFT 
ORNI DR ILQERSGGASAS PVEP I YDE I PWI HGRPPATLPRPENTLTNVSLRVS PGFGPEV 
RAALL3ESVSAVMVEAES IVPPTEPGDGESEYLEPLGGLVATTKILLQKGWPRGESNA 

CPn_0473 549602 548070 

No robust homolog present in Genebank/EMBL as of 11/7/98

GSIMAVGGVGGSRSPSPIPP^RNSECGKVSPKDhn^GEHTVSSSDSSLASCGPTIEERKA 

QLGGTDKIPLPSVKEPGDSQTSGRSGVLQRIWKGVKGVFKKTPQARPEVSSPRLPSHVOH 

GORLPGLEGFRDRIQKRSENPEADLGKMKRSYSDGDLDRVGHDSNEDSTEDSRSEGGEPS 

SKSSSFLSGVRGAVSKVHGALGDIKGKFQRSASEDDLTTGGEDSAGDTVKERRSEEAEAS 

SKSSSFLSGVRGATSTVQGALGDAKEKVSAFGEQAAGAIRSAPGNIRTRFQRSSSEGDLS 

NVNKAAKHLRKALENLEKVAPEQVSPEVASRVQSIXARMEQLTHQEPPTVEDLITFVESN 

VGSDSVEYAS rVPOrX3SQAPAETA£A?ETGGVEGSAAQGAWKALRDFVVS I FQAVASFFR 

AIASRLSSARRESAVDDLASESOTQWFVEQEGVSNPSAAPSLSFAEEIARRAAEMSNRNA 

QSLEKLESGNVTD PVIOQGLGLARSFAPEGQ 

CPn_0474 551600 549807

CT365 hypothetical protein 

LKI I IS ISFMSTSPrSNDPRYLSLSNATEKTSLLANSRSLSPVPNSLVPSNPEDTGLRKS 
IFTHSVTLFAGLWLLVAVSVVWALTVIAPGVP^ 

VRDYMSPRMOESSRIKSAIAVGTGFTVMGLVMKVGANFVPGGYGGLVGSLGSSAYSRGSQ 
TTIJVSFSHYIYTKFFRSEKVAKGEKLTEAErriKZAKKIJiYITLSIA 
IAGTVLLGGAPAT IAI ILAPPLI S IGLTTVLQTILHSS IGKWRAFLLTQEKKDLFVDTSL 
KDIRLEKLPPSEVEESETSQSVIEVPDSEGIAETRISAEEIDTRLSLTTRQKV'IFALATL 
LLLAS IAAF I VTG FGGLTVMQVLL VASVG S A V ASVTL PMVS SG FSYVAYQLKARLN I SKL 
RWKEAKNKK RVRQFL I ESG VI ASDREFNQMWKTVYKKQ I QKTDAAI REEVRNFEKGGEVN 
SALVGG I LLGVGTG IMLLALVPAFAP IVPG I LALGGSTLG I AG S ILMRKFVNWLYDELVK 
LYERRRNRRELLYGPESKMRS I ATDLWEALAASHDHLF DLDG PVDF I DVDVDIDGAA 

CPn_0475 553850 551685 

glgB-Glucan Branching Enzyme

ps**vdklihpwdldllvsgrqkdphkllg i lasedssdh ivi frpgahtvai ellgelhh 
amayssglfflsvpkgighgdyrvyhongllahdpyafpplwgeidsflfhrgthyriye 
rmgjvi pmevqg i sgvlfvlwaphaqrvswgdfnfwhglwplrk isdqg iwelfvpglg 
eq^gykweivtqsgnvivktdpygksfdpppcxn'arvadsesyswsdhrwmerrskqseg 
p\ctiye^lgswqwqegrplsysemahrlasyckemhythvellp itehplneswgyqvt 
gy^ftsrygtlqefqyfvdylhkenigiildwvpghfpvdafaiasfix;eplyeytghs 
qalhphwnt ftfdysrhevtnf llgs alfwldkmh i dglrvdavasmlyrdygredgewt 
PN&YCGKENLES i eflkhlnsvihke fsgvlt faee STAF PGVTKDVDQGGLGFDYKWNL 
GWMHDTFHYFMKDPMYRKYHQKDLTFSLVTC^ 

WTRF AQMRVLL S YQ I C L PG KKLL FMGG EFGQYGEWSP DRPIJ!)WELIJ^HHYHKTLRNCVSA 
IJ^^^IHQPYLWMQESSQECFHWVDFHDrEWWIAYYRFAGSNRSSALLCVHHFSASTFP 
SYi^CEXJVKflCELLrJTCDDESFGGSGKGNRAPWCQO 

FF|f| 

CPn_0476 554877 553858 

CT865 hypothetical protein 

GRGR&ADWGDCMIDIMQHFKPYTMVPGQKLPIPGSLLYAQ\/^ 

LQVQpPLKRFAVFQDLHRGGLAVTSERYKYYLLPSG ECTQS IKGKLPSAAQAGPLLSLGV 
HKJiliffiWQKVRCRRDLKEILPLWFRFAAMAPKGSYRCLETTAIGSLVKTAHQRVIjHRETTE 
IAPALLSIALAGFSECFLPRSYDEEFQGILPQDGDPEGGVPFELLSYSFGMIQDIFLRHQ 
GQflSV^I LPALPPEFPCGRLIHVALPNLGTLS IVWTKKT I RQVELHAEYSGEVFLKFCSSL 
CSARLREWSEPJ^LSGSKRLSLGETIjEIKAGTTYLWDCFHK 

CPn_0477 556112 554844

*ycfe^_Bs Hypothetical Protein 

RYEff^AE^XGTFKLVCLGCRVNQYEVQAYRDQLTILGYQEVLDSEI PADLC I INTCAVTA 
SA^SSGRHAVRQLCRQNPTAHIWTGCLGESDKEFFASLDRQCTLVSNKEKSRI.IEKIFS 
YDTT FPEFKIHS FEGKSRAF I KVQDGCNS FC S YC 1 1 PYLRGRSVSR P AEK ILAE I AGWD 
QGYREWIAG INVGDYCDGERSLASLI EQVDRI PGI ERIRI SS IDPDDITEDLHRAITSS 
RHTCPSSHLVU3SGSNSILKRMNRKYSRGDFLDCVEKFRASDPRYAFTTDVIVGFPGESD 
QDFEDTLRI IEDVGFI KVHSFPFSARRRTKAYTFDNQI PNQVIYERKKYLAEVAKRVGQK 
EMMKRLGETTEVLVEKVTGQVATGHS PYFEKVS F PWGTVA I NTLVSVRLDRVEEEGLIG 
EIV 

CPn_0478 557640 556210 

hflX-GTP Binding Protein

WHGGPLDTIDTPGEQGSQSFGNSLGARFDLPRKEQDPSOALAVASYQNKTDSQVVEEHLD 
ELISLADSCGISVLETRSWILKTPSASTYINVGKLEEIEEILKEFPSIGTLIIDEEITPS 
QQRNLEKRLGLVVLDRTELILE IFSSRALTAEAN IQVQLAQARYLLPRLKRLWGHLSROK 
SGGGSGGFVKGEGEKQIELDRRMVRERIHKLSAQLKAVIKQRAERRKVXSRRGIPTFALI 
GYTNSGKSTLLNLLTAADTYVEDKLFATLDPKTRKCVLPGGRHVLLTDTVGFIRKLPHTL 
VAAFKSTLEAAFHEDVLLJHVVDASHPLALEHVQTTYDLFQELKIEKPRI ITVLNKVDRLP 
QGSIPMKLRLLSPLPVLISAKTGEGIQNLLSLMTEI IQEKSLHVTLNFPYTEYGKFTELC 
DA6WASSRYQEDFLWEAYLPKELQKKFRPFISWFPEDCGDDEGRGPVLESSFGD 

CPn_0479 558434 55761b

phnP-Metal Dependent Hydrolase

AItiMVRDIQf3ESIGKLVFLGTGNPEG IPVPFCSCRVCQNTGIHRLRSSVLIQYQNKTLVI 
OAGPDFRTQMLVAGVSELDGVFLTHPHYDH I GO IDDLRAWY I VTQRSLPLVLCA3TYRFL 
NKAKEYLFATPNVESSLPAVLEFTILNEDCCOEEFC/JrPYTYV^YYOKSCHVTGFRFnNL 
AYLTDLC:;VDAKIFf:YLDNVETLILSAGP:;CTPIPF(>;HK^CHLTVEnAKAFANHAGIKN 
I A CTII [-niCLEAERDOMPEVTFAYrXJMEVLWTL 

<:Pn_(MHf) l >Vn7 r > r > r >Rt.50 

'"Vim liypof.hf't L< a L pro rem 

: ; i : ; w ; i : ;< ; h lcj f.fv: ; k k eq dc ;m lc y. > r , pc y r\ w Jt ; [ e cy k n r \ r ycq lc a cws p yw 

l 'V [ WDVf jr I A P PV I LQV I SiCKQHK 1 1. t'VHG P I T'J LWAL C PVG '.A POLE:: AMY ELC3 

^vhni'dl'::; rvswvR ;gu- r faglivgvmveaplt agljawv [ rr r kxivga elclfall 

MAYU'.ltf iPVRKWL.NLSHEY rTQCHORQ [QAHrJQNY';7 ETFYPATCALlIQP [TKLPNGSRR 



DN 

CPn_048l 5^.^'*7l SS^M'i 

No robust homolog present in Genebank/EMBL as of 11/7/98
3CLRTEG rLMATSVPVTSST3VGEANSSNERFTERT3RMYYAALVLGAL3CL IFIAMIVt 
F PQVGLWAWLGFALGCLLU3LAI VFAVSGLVLGKTLEPSREATP PE I VAQK EWTTQQDV 
EjGNEYWRSELISLFLRGDLHE.'j L I VDSKDRSLDI DQ5LQN I LKLEPL3TTL3LLKKDCVH 
tMIILHLVRQWNLLCT/DLSPE'/TAHAEELLLFLIEEOYi'GPC LLKLIRYGDALQATSPLM 

. : , ; .hi t . v\>-\.. . • f * nam -.owT'T-vv :"kz 

i-LJAN^^KLONLi- I Ar j J E- "jM i\ f Li uK t ^V'v KVbKHU 1 RLvlLSNTEILEiNEJ^rLCS 
LYEYPLSYLIDWA\*^LIX^PGTEISLEDQADYTVCLQGLDSMLSQFASRLQSGQKVLNPR 
DVL S EQ AA VM LVH G L-^AOG V 3 FOG LKALMY LT AV PQ RMWLG ALPLFESF P VTNRMK EFLG 
ESLGD 

CPn_0482 561764 560961 

artJ-Arginine Periplasmic Binding Protein

NLAYRAQTFMIKQIGRFFRAF I FIMPLSLTSC ESK I DRNR I WIVGTNATYPPFEYVDAQG 
EWGFD IDLAKA I S EKIjGKQLEVREF AFDAL I LNLKKHR I DA I LAGMS I T PS RQKE I ALL 
PYYGDEVQEI>!WSKRSLETFVLPLTQYSSVAV(yrGTFQEHYLLSQPGICVRSFDSTLEV 
IMEVRYGKS PVAVLEPSVG R-/VLKDF PNL VAT RLELPPECVATLGCG LGVAKDRPEE I QT I 
QQAITDLKSEGVIQSLTKKWQLSEVAYE 

CPn_0483 561330 564964 

No robust homolog present in Genebank/EMBL as of 11/7/98

1 1 L I KKRAI FERMFP I PP PHC P PNNKNNFYHLTTDTK D PLLLR I LRT IG YVLLH 1 1 TLG L 

LLLIHYYKHHRVVRKEGLPTPPTLPKGPEPKT IE I AKQP PKDGEDKKPDVPK PGT PPPED 

TPPPPPKAPSPASPKVPKQPADKKPTPPPEAPPPPVRVATPMPIJ^PSSC?GYWQCLNRMVS 

M\^jy^PLPLPAMQVDPILGDFNPHFVASYPNRIDNEPKYFQIKQFKKIAQNPDLPQOHR 

RLAQLS L EQ AL Y LNDNYYLVNV PG DG NC FYRAY A VGWL S ALY E ES S RND I VF EQ EATR LL 

DLPFAS5SPANAN1XAE3-1AFT,:,QLCSTYCSFIDLYDGVIL$QKHTATLIAFLRKLSAYAI 

RMIAASSNEETARALFISI^DDLLPSVLEFIjAANRPYSELFQNLIDHSALPYMQSRDK 

LFLLLEHLPALFLTDAEIjQKMSPEX^IjRKQYE^EIREAFAKLSP^ 

VKDHLPEAIRCQYSRFIATIENRRSGDLPWSPALSFFAFlJrTCPSVRFHKLCATFYKSLE 

DI 1 1 ASAPPQRS iqeilqi snaslsylnedldsswqrevissnimt iltthesltlessm 

FQLETLHKRIAm^JOWrSTSFETPPLSNQPDI^SNLVNKLLVAIHSK^ 

ARS LRLTRD EGSGLS Q EQ DLLYTQ AVQLLF F I LQH PQVNNRP ETKD AVK ELKMLLL PF LQ 

YAFKKVENEKKLQKI^RSILGSLVLKPPARYPSTPS^fiCDKETFCKFWSRHPEVMVLDPIL 

EKIO^FXJ^TFPNYQLETEAILLEKEIESTFRhKMJWLTRLNLFGSKLGSPSSPTALS 

I>2FSKSFLIFCFLMJYPKLLQKKTPLAARIjDAFQREASHRFTQVKDKLLLSLKYGFPLAT 

ATINQYSRAR1X}LICNIJJ?NTVTASIX3FCRSGFRQSLIGYI^SLSSNELGDII^D^ 

EANDVAAMTTVPI^PFAVCLIMSDPX^TVSEENIEITFVAMHGFI^^ 

NHYGCLLPRNPRTEDQNSKPDSSNP 

CPn_0484 564931 565824 

aroG-Deoxyheptonate Aldolase 

RSELKTGQLKSLVLHEVLILTFTYPLPRTLKQHPDFA^HTVPISPNLSFGEGSPILIAGPC 
TL ESYE HTVS S ALTVKEAG AQVF RGSIRXPRTSPFS FQG WEK ECVLWH K EAQ S I HGL PT E 
TEVLDVRDVE ITAEHVD I LR IGAKNMHNT PLLQEVS KSHR P 1 1 LKRS PAATL EEWLCAAE 
YILASS PSCPGVILCERG I RTFEHSTRYTLDLNTVALLKEIS SLPVIVDPSHAAGKRSLV 
LPLASAGLSVGADGLMIEVHAHPEKAIXDAKQQITPEELHLFAKKHFCPSESRAHAIS 

CPn_0485 565993 566229 

CT382.1 hypothetical protein 

QPIGRTPTRVFLWRFMIKQACKFYLLQCLLCALYWLLKYCRKLLKGTLKHSEETLYQALL 
SSL IDLL YQLKQLPAPTNE 

CPn_0486 567799 566405 

hypothetical proline permease 
AQHRSLLKGNI FHLGCGVLYFMNFSLFLFFL I AIQG ICLYVG RRGS KKVEDRESYFLAGR 
SLK I F PLMMT F I ATQ I GGGVLLGAAE EAFCYGYGG I LY P LGVALGL I FLGMG PGKRLAEG 
SLTTWS I FEVFYGSKKLRK I AFLLSAGSLFFI LVAQVI ALDRLFSSFPFGKYVTVAFWI 
VLASYTSTGGFRGWRTDV IQAGF LL I AVLVCGVSVWLS VPKSLSVLDPFQS LPCAKLSN 
W I FM PMLFMLVEQDMVQRCVAASS PKRLQWAAVGAGLVL LLFNF I PLFLGSLGAKAGLKA 
GCPLIDTIAYFCNPSLAAVMAAAIGVAILSTADSLMNAVSQLIAEEYPTLKAPYYRYLVL 
GLAVAAPLVA IG FTNI VDVL ILS YS LSVCCLS VPVG FYL LAP KGRRVSGAAAWAGVLVGA 
LGYGWVQIVSLGMFGELLAWVGSLVAFSFVGF I EITWKNKVKTQT 

CPn_0487 56^833 568112 

CT384 hypothetical protein 

RRTGGISLTYSSFRWASFRCYSLIFFCFCGSLFGSESLRYQLLIQDFAKVSEEGIGLLES 
KEYSLLQAKLVLRALAQNSSFDCWFRSFKKCQ I SYPELAHDRDVLEEFGIQVLREG I ENP 
S VTVRA VS VLA I GLARDFRLVP LLLQSCNDDS A I VRSLALQVAVNYGS ES LKKA I VELAR 
NDDSIHVRITAYQWALLQIEELLPFLRERAENKLVDSVERREAWKACLELSSQFLETGV 
AKDDIDQALFTCEVLRNGMLPETTEIFTELLSVEHPEVQESLLLSALAWSHQLQNHKEFL 
SKVRHVMCTSPFAKVRFQAA.^LLHLHGDPLGRD3LVEGLRSPQPLVCEAASAALCSLGIH 
GVP LAK EHLESLSSR KAAANLS I LLL VS R ED I ERAG DV I ARY LSN P EMCWA I EYFLWDAQ 
WNLRGDTF P LY S E>M I K R E I G RK L I RLLAVAR Y 3QAKAVTATF LSGQQAQGWS F FSGM FWE 
EGDVKT3 EDLVTDAC F.AAK LEG ALAS LCQKKDQASLQRVSQL YNDSRWQDKLAI LESVAF 
SENLDAVPFLLDCCHHEAPSLRSAAAGALFS I FK 

CPn_0488 570147 569767 

hitA-HIT Family Hydrolase

RKLPTCFAVjgVTRSRDHMTVFKQIIDGLIDCEfO/FENENFIAIKDRFPQAPVHLLIIPKK 
P I PRFQD I PGDEM I LMAEAGK T VQELAAE FG r ADGYRW TNNGA EGGQAVFH LH I HLLGG 
RPLGAIA 

CPn_04«'J 5~L0"J7 5700'i»- 

tm87 hyporhf-r i. a 1 pLOtoin 

R E VF A E ENYFnLVTMCLAyfr .RR l VGMQ T PRS tGTI IDG3 PI \ ADEVTA( > ALL I t FDLVDENK I 
ERSRDPWL.^KCFyyf DV(Vr\T^ E ENKRFDHHOVnYDf S'lWf-^lAt 1M [ LHY LKEFGYMEX2EE 

YHFi.wjTLvnf;vDr:on^;RFr,'KEGFc:;r^D[ i y r ynppeeeetn.suadfi'calnftedf 

LC f < L I' K K Ff J Y D F< V< " Hi, ; I V R F ET f-'DMf * L Y F L P I- J LAWO ENF F F I a ^"J KK M P AA FVC f P.^C D 
OW I LKJ r p E ' N L DR R M [ ) VP V P L FA MAC ;i , [ ,r, K C 1 /A K VSG r PG A V rc ' i ( Kr ; r , F 1 .1 : VWTN H ECC 
OKALHLTLOOP'.E I 

('PnJM'Jf) 12 /K '.Mi:; 

CT JH7 hyporh<-t u iL [Moroni 

lMYNLLIIAMHrJAA::nx;i'I.V::f!r^KL:;PH tYf / ;EVL I i^H PAYFE/^f'ULE W [OVNLK:: 
:;[,AuL/;VPJ\V[JMHr,i:i.NKAItKI- AI'LHVI.rM.yjM'IATAMLELLEl* ;:;rVt'Kr.KAAI)|)RHI. 

Vii::[f'Y[.NPMPnrrnR'tv;::n,;,PFr;KK[j:iiriTi,^E UiltPLWFiA'iiAWii yi irviv; 
F L P LM; ; K : ; I j'I'R PH LK I RK F L P L YQMV^D R P PV P EDH K I L L I KTE P LH T RTVFAR WQDLL
PQGLRHTAAD [ LEPTTQCSGD t YEFYG3T3EPIERIPLEFFTLEPYKEH3FFFYRDMLQE 
TLEGPQEVFRVFESIPECEDOAAMFtSKCoELLELSQDSWI XKPRI3PSDERHAREIQKH 
[EDQPCFPFLKAMFTDHIT5QGVLFSRYFPSA.3LKGMFLSNYSRYYL0HIYFQIPSPTSG 
EFFSNRDRS FLLDLYFAG rSVFWADLESKRLLQY rKRRNKDVGMFVPKHOAEQFAQSYFI 
G I HGSC L IAGDYDEFLRELLTGMHTLSQOFT T PEFPPQTPLAILTGGGSGAMELANRVAT 
ELSILSCGNLISLDTTNAYVF^KMSYAIPDLLERQADFHVDLAVFVIGGMGTDFELLLEL 
T 3 L KTG K KALV P VFL I GPVDYWK3 K I TAL YNSNHAVGT I RG S EWVHNC LFCLS S AKAG I A 

; .■--■■■■■■i.riirn.i '* *r n ; -vr r , ;rv* 

CPn_04 91 574 5V5 1 j/JJ>; 

CT389 hypothetical protein 

r LSSLYTVFTMKTAFHSCYSWFCWLFSFLVLFVGG I AGGEPLCPDCKYETKSVLRSDQLP 
DHLWNY ENDCYLTGYVQSLLDMHFLDSRTQW I EKNRAYLFSLPVDSSLS EA I TNFVRDL 
PFICAVE IC ERPYGEC ITRSSAERPLLPKEKTLGMPI FCGKEGVWLPQNT ILFSPLIADP 
RQVTNSAG I RFNEKWGNRVGATI FGGDF I LLRLFDVSR FHVDCDFG ICX3GVFSVFDLDH 
PESCMVNSDFFVAGLWSGAIDKWSFRFRLWHLSSHLGDEFILTHPNFPRFNLSDEGVDLF 
I S FRYTPQ I RLYGGCGY rVSRDLTFP ERP FYC EWGAELR PFGLREGNLHAQ P I FAMHFRC 
WEEXiKFGIJ>QSYILGMEWAKFOEIGRKIRAVLEYKCCFSKEGQFIREPCNYYGFRLTYG 

CPn_0492 574643 574804 

No robust homolog present in Genebank/EMBL as of 11/7/98 
LF S L I F P IC EERNSQQTYKHLHVES ACFLLES P LK I HWS S PYGF P PFYRRDLKL 

CPn_0493 575142 574855 

No robust homolog present in Genebank/EMBL as of 11/7/98
SKTEGSHSKTSKGFVGRFVQWIRTFTGRGSKKRSPSSFSPTHPYrRLRTYTRSPKQSGVE 
RKQEDAETSF I ETPKG I LKKPGNKDPKGKHVHWKDS 

CPn_0494 575370 575146 

No robust homolog present in Genebank/EMBL as of 11/7/98

VIMIRVNPYGSYRGPJ^PSPFXGKKrr/PLSGNSRI^RRGGIRP^KSASVGVTSGSKTGKA 

SLEKKVKGISEAHFK 

CPn_0495 575507 576793 

aspC-Aspartate Aminotransferase 

RIUjFCKNQKMAIQKAGAFLRCLPSESRPYLEHAMRJRNPHFSLIJCPQYLFSEISKKI^QFRK 
ENPEI SVIDLS IGDTTQPLCRS ITQAIKEFCVSOEKQETYRGYGPETGLEKLRTKIASEV 
YENR ISPEEIFI SDGAKPD I FRLFS FFGS EKTLGLQDPVYPAYRD I AH ITG I RDI I PLAC 
RKETGF I PELPNCXJSLDILCL^YPNNPTGTVLTFQQLQALVNYANQHGTVL I FDAAYSAF 
VSf^ifSLPK^IFEIPEAKYCAIEINSFSKSLGFTG^ 

lfxtt fngasllmqeagyygldlfpt p pa r sl yltnaqklkksletagf svhggdhapyl 

W^PEGISDEEAiTDFFLHQYHIAWPGHGFGSCGQGFVRFSALTOPQNIAIJ^CDRLCTA 
SLKiTMVLA 

CPn_0496 576751 577812

cre&i hypothetical protein 

PP&YRFTKRNDGSCMT ILRKLSQYLFFFSLFCSF IYVATCGSQPDSVSSPKI AIFLSFPH 
PLBEDCSKSC I ETLKDF ENLPE IWLNAEDS I VKARK IARSLHTDKNWAIVTLGT IATK 
VM|pJETQKPVIYAAVPDRESLTLPKNI^ 

KPSEPFPSDLQKEIVKKLHASGIEVIEISITSSTFKTRIRQAIDKRPSAIFIPLSPLSHK 
EGggTLQEILKEKIPI ITDDTSLISEGAC I ACSVDYKKSGKQ IAKI VHHLLYNNHDVDS L 
RKf 1 AQRLS PTTTFNEDI I KYLG IKLHKTERKQFLS FKS KKLEKSEKGKNVAVS 

CPn_0497 578107 577820

CT338 hypothetical protein 

IFQRVVLDDSWILEVKVTPKAKEl^IVGFDGQALKVRVTEPPEKGKAN^ 
LPKSbVTLIAGETSRKKKFLLPNRVQDI I FSLHIDV 

CPn_0498 579062 578085

No robust homolog present in Genebank/EMBL as of 11/7/98
YC R LRRAP FMNR RKARWWALF AMTAL I S VGCC PWS QAK S RC S I DKY I PVVN RLLEVCG L 
P EAESNVEDL I ES S S AWVLT PEERF SG ELVS I CQVKD EHAFYNDLSLLHMTQAVPSYSATY 
DC AVV FGG PLPALRQRLDFLVREWQRGVR FKK I VFLCGE RGRYQS I EEQEHF FDSRYNP F 
PTE|KWESGNRVTPSSEEEIAKFVWMQMLLPRAWRDSTSGVRVTFLLAKP 
DTLCLFRSYQEAF PGRVLFVSSQPF IGLDACRVGQFFKGESYDLAG PGFAQGVLKYHWAP 
R I ^LfTLAEWLKETNGCLNI SEGCFG 

CPn_0499 580404 579205 

No robust homolog present in Genebank/EMBL as of 11/7/98 

L3 VYLL I FYFCNC STMS SVNQ S SGTP N PEEVT S P EST EENKNWS S DEAQAT HAVALP I V 

TQ LS LP EGVGTS S EET AS N PRVD E I VAEVSS S RAVADQ 1 3 S LVERVG ELLDD LKGAQSLF 

TSFQSELKNCLPAWKSSTRRLETRGAGDNADIARLELFRSDYEAVLGHANQFHGKAHLIL 

SKLTDVHHKLQGLSREDLSLAFDNNDRVLEHLGSLGLDVDAEGNWSLSCERGIPRLVLTA 

DSMLVQIKKVNLPTVEELRTLQGTTESSSDPRVEESLSCCERLLNELRRLWANFVGFISS 

CYDNIVFVLMWIVRRINLLPGLGCLPFHNPDASQEDQRSSSGERSTRRERLSRRSDLSEE 

EMIVRAEGESIHPESPHGDGRNQPSRGDKQDSDSEEETEL 

CPn_0500 580647 582362

proS-Prolyl tRNA Synthetase 

OPHSMKTSQLFYKTSKMANKSAAVLSNELLEKAGYLFKVSKGVYTYTPLLWRWSKMMNI 
rREELNAIGGQELLLPLLHNAELWQHTGRWEAFTSEGLLYTLKDREGKSHCLAPTHEEVI 
CSFVAQWLSSKRQLPLHLYQIATKFRDEIRPRFGLIRSRELLMEDSYTFSDSPEQMNEQY 
EKLRSAYSKIFDRLGLAYVIVTADGGKIGKGK3EEFQVLCSLGEDTICVSGSYGANIEAA 
VSIPPQHAYDREFLPVEEVATPGITTIEALANFFSIPLHKILKTLVVKLSYSNEEKFIAI 
GMRGDRQVNLVKVASKLNADDIALASDEE I ERVLGTEKGF IGPLNC PI DFFADETTSPMT 
M F VC AC NAK DK I iYVNVNWD R D LL P PQYGD FL LAEEGDTC P EN PGH P YR I YQG I EVAH I FN 
U.l T P. YT D/ » F EVN FQ DEHGQTQQCWMGT YG IGVGRTLAACVEQLADDRG IVWPKALAPFSI 
TEAFriGGDTVSQELAETIYHELOSQGYErLLDDRDERLGFKLKDSDLrGIPYKLILGKSY 
0 TFEI EoRwGEKYTVSPEAFPTWCQNHLA 

l;rr:A-fmi Tz .inr.ci ipt"Lonal RvpteTsn: 

[ LLTFfX^FP [ MLlTVT r VLVGLrMAR::KV<3KPD^K r LD [ L F ATT E L Y LKTGO PVGS KTLK 

k:;ki;:;i;l:;tat t knyfahleal\;flkknht ;ct 5 r [ e j t dla lr h yv d i i o e ec p eae i sap r 
i * dk i : ; r j r , v : i e : j i< n u k ni .o k at e l lg e i l d l pt f~ < ;:: pp f fn d.s vtn r o r tov dkq r avt 

! L:;'rEF-v;0[ K'tTJ-rL.WLL'CAL^rL.UKRIEKrLONTY rPKLPTNEEL;;KKEEHL.-;M.';LYNEV 
W^YI.TRYCNK.'JKEDLYUTf JM 1 :KL.L.KYI^\FKDPEVLALGL:lLFENRROMu ELLN IGMMKC 

hataf i' ;ki-:l: :l> r u :r :nf* x:^v it i pyymnr;;fu;alg i djp enlpykfalpllklfan 

K EMF/ri/l'Q.'IFYK T'K I <: irPRPE.Ti'NCKLilNFP fLRTCYVI I KLLPSKPTT . 
CPn_0502 ',H )»><U ^42C 1

grpE-HSP-70 Cofactor

GD , yMTDTPPENEEQfHE3rr/0NF^E , /EHI^EI\'TLKTELKEKNDKYLMALAESENSRKRL 
QK ERQELMOYALENTL IDFLNP I ESMEKALCFATQM3 DDVKNWALG FNM I LNQFKQIFEE 
KGIIEYSS IGQKFNPFLHEAVQTEETSEVPEGT ILEEFAKGYKIGERPI RVAKVKVAKAP 
TPKENKE 

CPn_0503 584225 586213

";■! - - . r. r \ . ■ t v ™[ ™r ' rv\ r '"".\EKLVGI 
PAKRgAVTN FEKTLGLTK P F IGKKYJ EVA J E I wTV P t T/TSGSKGDAVFEVTX3KQYT PEE 
IGAQ I LMKMKETAEAYLGETVTEAVI TVPAYFNDSORASTKDAGRI AGLDVKRI I PEPTA 
AALAYG rDKVGDKKIAVFDLGGGTFDIS ILEIGDGVFEVLSTNGDTLLGGDDFDEVIIKW 
MIEEFKKQEGIDLSKDNMAI^RLKDAAEJCAKIELSGVSSTEINQPFITMDAG^PKHLALT 
LTRAQFEKLAASLIFJ^TKSPCIKALSDAKLSAKDIDETvX 

EPt^GVNPDEWAIGAAIQGGVLGGEVKDVIX.LDVI PLSLG I ETLGGVMTTLVERNTT I P 
TQKKQ I FSTAADNQPAVT IWLQGERPMAKDNKE IGRFDLTD I PPAPRGHPQI EVSFDID 
ANG I FHVSAKDVASGKEQK I RI EASSGLQEDE IQRMVRDAEINKEEDKKRREASDAKNEA 
DSMI FRAEKAI KDYKEQ I P ET LVKE I EER I ENVRNALKDDAP I EK I KEVTEDLSKHMQK I 
GESMQSQSASAAASSAANAKGGPNINTEDL!OCHSFSTKPPSNTKjSSEX)HIEEADVEIIDN 
DDK 

CPn_0504 586418 588514 

vacB-ribonuclease family 

ATQFTSETTGFLVC^PKLTGGAQLLKKPKRKPGRRTYGKSLKIFIPGTLFVHARKGFGFV 

S PDNPEEYPFDI FVPARDLRGALDGDHVIVSVLPYPRDGQKLKGT I SEVLARGKTTLVGT 

ITSLVS PTSALAYTSMSGSQSL I PVELLPGRTYKIGDR I LLSTPPWVDKPQEGASPALQM 

LEFIGHITNAKADFQAIQAF^LAEEFPPEVIEEASLFSQKHITQVI^SRJa^I^LLCFT 

IDSSTARDFDDAISLTYDHNNNYILGVHIADVSHYVTPHSHLDKFAAKRCNSTYFPGK^ 

PMLPSALSDNLCSLKPNVDRLAVSVFMTFTKSGHLSDYQ IFRSVIRSKYRMTYDEVDNI I 

EKKHSHPLSKILNEMATLSKKFSDrREERGCrRFVLPSVTMSUm^EPVALIE^OT 

HKLIEEFMIJCANEWAYHrSHC^^SLPFRSHEPPNDENIjLAFQEIAKN^ 

PDYQYLLQTTS AGH P L EOVLH S Q FVRSMKTAS YST ENKGHYG LKLDYYT HFTSPIRRYID 

LIVHRLLFNPI>S:DQTHLEIIV71ACSTKERVSAKAI^SFF^XKKTRFIN1^ 

KAY I IT ANHEGLSFWTEFCHEGFI AAAELPKEYSLKKNALPES I PDKMKPGAS IKVTID 

SVNLLTQKIVWS I ATTTEDKPKK IKKTPSKKKGTKKRAS 

CPn_0505 588471 589106

*3-methyladenine DNA glycosylase 
RKRLLRJ(KERKKEPRNVI£EHFFLSEDVITLAQQ^ 

G PDDKACHAYNYRKTOR^^lAMYLKGGSAYLYRCYGMHHLLNVVTG PED I PHAVL IRAILP 
DQGKELMIORRQWRDKP PHLLTNGPGKVCQ ALG I S LEKNRQRLNT PALY ISKEK I SGTLT 
ATARIG IDYAQEYRDVPWRFLLSPEDSGKVLS 

CPn_0506 589055 589840 

CT421 hypothetical protein 

CPMEISPI PRRFGKSF ILNNUCLYSKETNAHFL ISCRRIMRKYFITGLVILLPLAITr AI 
VTMIMNFLTQPFVGLASEFFFJCFSFYTKHRAI^JCFVLQII^ 

FKSLLS IYDKILHRI P I IKTVYKAAQGVMTTI FGSKSGSFKQWMVPFPNANVQC IGLVA 
GDAPWCCTGEKEDDPLVTVFIPTTPNPTSGFLTLFRKSDIVFI^DMKIEDAFKYIISCGV 
LSTPMACPSSPLPDELHQDQGS 

CPn_0507 589398 590122 

CT421.1 hypothetical protein 

ST PY PQF PLSGE IKKFNI ELFMTRMSKQARRRAKS P KKRKPKYA IVHPAPAPRIVYKLHT 
NALSTSDSIFIPKIG 

CPn_0508 590133 590300 

CT421.2 hypothetical protein 

SR IMSRHRSYGKSVKGVTKRNVLKRF ERVEVLRKLGRWNDSTAKKVTGLPKTP ILK 

CPn_0509 590299 590808 

(predicted Metalloenzyme) 

NKFVFLYGNF I RVTQEKI KIHVSNEQTC I PIHLVSVEKLVLTLLEHLKVTTNEI FIYFLE 
DKALAELHDKVFADPSLTDT ITLP I DAPGDPAYPHVLGEAFI SPQAALRFLENTSPNQED 
IYEEISRYLVHSILHMLGYDDTSSEEKRKMRVKENQILCMLRKKHALLTA 

CPn_0510 590804 591973 

tlyC-CBS Domains (Hemolysin homolog) 

QLNMLHILLAIFCILLFLAFGLTQPSCHGSSKFLKTLNQRFFKDKGREYPPFPSAPTILA 
TLLCILYGAI>3TKLYTLLPPKTAHKDLLFWPLYSLSALIAYGFLPPWrSTKVPKETTAHL 
RFLASVFQLGLFPLQLLFYRRRPNQQVRSSTSFQSQLSEALSAFDNLIVREVMIPKVDIF 
ALPEETTLQEALVLVS EEG YSRVPVYKKNLDN ITG I LLVKDLLLLYTSSHDLSQP I SSVA 
KPPFYAPEIKKASSLLQEFRQKHRHLAI ivneygftegiatmedi ieei igeiadehdvq 

FlJTPYKKIGSSWIVDGE^ISDAEEYFfrLKIDHENSYDTLGGHVFHKVGAVPQKGMRIHH 
ENFDI EI ITCTERNVGKLKITPRKRKFN I S 

CPn_0511 592141 592488 

rsbV-Sigma Regulatory Factor

MSDrOKEEHGSTTIFHLHGKLDGISSPEVQENICOSLAAGSKNIILDCAHLDYMSSAGIR 
VLLQSYHQVGQHSGKrVLTTVPKTIEQTLYVTGFLSYFKIFNTVDEAIQTLNKDGD 

CPn_0512 592538 594412

CT425 hypothetical protein 

SLPLTMRRSVCYVNPS I ARAGQ ISTWKFLYSLATPLPAGTKCKFDLAGSGKPTDWEAPAT 
DLSQTRrJVIYAEMPEGEI IEATAI PVKDN PVPQF EFT L PYELQVG ETLT IVMGASPNHPQ 
VDDAGNGAOLFAQRRKPF\'LY r dptgegnydepdvfsmd r PGNVLKK I E I FTPS ywknk 

rfdetvrfedefgnltnfsreetrielcyehlper^lnwqlfipetgfvilpnlyfnepgi 
yriqlknlstqe r f isap i kcfadsaprilmwgllhgeservdseenietcmryfrddral 
nfya^^sfenocnl^pdiwklinotv^lfneedpfitlsgfqysgephlegvrhilhtke 

TKSEErJKHKEYKHIPL.AKLYK.^TVNUDM [.'a T PSPTASKEHGFDFENFYPEFERWEIYNAW 
f SS.'iniTAALtfNPFP [(.X'KDSEDf 'R< ITV I EGLKKI JLRFGFVAGGLDDRGIYKDYFDSPQVQ 
YS ! -GE.TA I rCNKYTRriU.VI ALFARI !( ' fATTC PP I VLSFfJ TTCAPMCSELSTCSKPGLNV 
NRH I 'XJHVAGTALI.KTVF [ I RNGIJVU ITFFPD^rJTJLDYEYDDMVPLCSVTLKDP^KAPF 
VrYYI 1 P'/TOAfjN/\MAW , :.'r IWVDI ,N 

I-' i • ox LfJfjr r dm t i.U 1 

Vfier^GP.'jLKGVKNOL.-VNKKKIiVTKO .TVLutll ,ER [VHQ.'JVI !OM'I , TCLLK?PPKTSPLYS 
[Fi:KLI J iAOEPr J .;£;FDA[,!!t.I.I .r/rNKnr/vFTr^fPADOVPKQPVGDTVYY^^TE.YLYPTNF 
t'[if":('KF r ' f ;FYAKP(niPK^WI,Y'Jl'i)|>[.f /A'lL>nrKTPtTE*/HtVax:FPr;rNL^YY^DLF 
TK [ KEY DF"fJ CM I KALTA TEYAYLGDLDNLo tRDVLLTLKDAGLDS I PGGGAE ILVDK IRN
FLAPKR LG:TGDFLN I H KMAHQLG t HSN ETMLC YH KEG PEDLVTHMVKVRD LQDETQGFKN 
FI LLKFAQENNVLGKRLRKGGQGHA [ PLKSLMAVAR I FLDNFSNMKALWNYLGI EAALDL 
L5CGAN D LS GTf i MC EK VFQMA53 KE P T KM DAEGMAA L I TQQGRT PCLTN3 S HV 

CPn_0514 595690 596520

CT427 hypothetical protein

GNGG PH HTT R EN AMGr IQ LQ PC r 3 LOCV 3 Y I NS F P LS L0 L I KRND I P CVLA P PADLLNLL I 

■ .(.'/. \:.t: : ; './■.*/• \APTFrr: ■ " ; \atl."Y*pg 

.:: :;.i."v:,:pnF.wi " ■■ . "":*/:.■_ i" >':r -^^lligdai' r.r';p.TYDL 

A3GWYDLTKLPFVFAtiLLH3TGWKEHPLPNLiAM£EAL0QFESSPEEVLKEAHQHTGLPPS 
LLQEYYALCQYRLG EEH Y £S F EK F REY YCTL YQQARL 

CPn_0515 596450 597181 

ubiE-Ubiquinone Methyltransferase

EKNTTKALKNSGNIMEPSTNKPDCKKI FDS IASKYDRTNTILSLGMHHFWNRSLXQ ILGS 
GYSIiDLCAGTGKVAKRYIAAHPQASVTLVDFSSAMLDIAKQHLPQGSCSFIHSDINQLP 
LENHSYPLAAMAYGLRNLSDPHKALQEISRVLMPSGKLGILELTPPKKTHPTYSAHKLYL 
RAWPWIGKSVSKDPDAYSYLSKSIQQLPKDHDLEDLFSKSGFYIAKKKKIJIjGAATIWL 

lekq 

CPn_0516 598904 597255

No robust homolog present in Genebank/EMBL as of 11/7/98 

RI S ISFRVSWFVKI ILAVLGRAIAKAYYVCMVARGLCDFPTLVPNERLPIGPFFVPQHTS 

GAKGKEFAKRNFSIISGLDDILKLCIU3RRPFALQWDNLSVKSDYEEAGPAIGIRSLEPQ 

VSOISPAHGRX^STLVQWAPILGSEEQLVWLEETTMKRIJCFPKSl^SKDAVIVDSEMVFVtl 

ANPTQE I PAAS ETVES SPVAPGNTTDTMPAASGTTDTTSGVS EAAAAEAAVDSTPGTEEE 

PSFSLRYALWQNVPYPEPPKEPEVMFTDEEKSLILEATRARRMELDLYNGYLiADYELSK 

D E I Q KHV PDL PENWRTNWRWS E RL YK FF F KT KKEG LEE I F LNKELGNM I LARGLAATQSQ 

AR I KVFWSLVAWU^SFNVGRSCTAKPLPTSKLDLFKSEFESKPKNNILTEFLVASDEE I 

LFKGLRVLEPG I EGWYDHPDGAG E I RSVLEGLVQAGR I SGYWENQPFGRFVLRGVGERRT 

ELVELLESLVASGEIMQFFESSDEEGAF 1 1 DNE PSKT AMLKQ RFKS CVRTKLVG S F ADE S 

LPRGRFTILV 

CPn_0517 599637 598795 

No robust homolog present in Genebank/EMBL as of 11/7/98

F IMSSLLSCGRIEFTRVTCSLKTYLEDTSQNQLSTRLVRASVI FLCALLI ILVCVALSSL 

I PS IMAIATSFTVMGLILFVMSLLGDVAI I SYLTYSTVTSYRQNKRAFE IHKPARSVYYE 

gvrhwdlgrsslgtgeipivrtlfspfqnhglnhaijaakiflfmehfspeppneplvr>/a 
clirdfrphvsslcfviekqgssij?tkegni , iceafksdydahfamvix:yrlihskliie 
kjj4slknidi ipsvmvredypsrpgegyregllrmyggkgal 

CPn_0518 600806 599832

CT429 hypothetical protein

Ft^fTYP VPQNP LLLR I LRLMDAFS KS DDERDFYLDRVEG F I L Y I DLDKDQEDLNK I YQ EL 
EH^RYCLIPKLTFYEVKKIMErrFrNEKIYDIDTKEKFLErLQSKNAREQFLEFIYDHE 
AELEKWC^FYVERSRIRIIEWLRN^FHFVFEEDLDFTK^ 

AR£jiLL 

lf , £| > SSITFSEKFDTEEEFLANLRGSTRVEJ>2L^ 
DFFpDDDEKWTKTKGSKRGRfCKSS 

CPn_0519 601707 600904
dapF-Diaminopimelate Epimerase 
QJpf\KLRILVYWMAFYSPSTISKYFIYSGAGNRFLJ^ 

KPS SCADAQLI I FNS DG S R PTMCGNG LRC A I AHLAS QKG KS D I SVSTD S GLY SGYFYSWD 
RVLVI^LADWRASWRLESRPDPLPKEWCIHTGVPHAVVrLPEISTLDLS ILGPFLRY 
H^TFSPDGVNVNWQILGHCQLRVTITYERGVEGETAACGTGALASALWSNSYGIWKESIQ 
lH!I!!WGGEI>riVSQNRGRVYLOGSVTRDL 

CPn_0520 602233 601646

clpP-CLP Protease 

E$MYFMADG EVHKLRD 1 1 EKEL LEAR RVFFSE PVT EKS ASDAI KKLWYLELKDPGK P IVF 
viNSPGGSVDAGFAVWDQIKMLTSPVTTVVTGLAASM^ 

Hdj re I GGPITGQAT DLD I H ARE I LKTKAR 1 1 DVYVEATNQPRD 1 1 EKAI DRDMWMT ANEA 
KDJIfLLDGILFSFNDL 

CPn_0521 603803 602241

glyA-Serine Hydroxymethyltransferase

KS LLKVFEKFKKFAI VEI FTKWAWSLLHKFLENASGKKGQSLASTAYLAALDHLLNAF 
PS IGERI IDEI^SORSHLKMIASENYSSLSVQLAMGNLLTDKYCEGSPFKRFYSCCEKVD 
A I EWECVETAKELFAADCACVQPHSGADANLLAVMA ILTHKVQGPAVSKLGYKTVNELTE 
EEYTLLKAEMSSCVCLGPSLNSGGHLTHGN^LJvJVMSK^ 

R LAKEYK PKVL I AG Y S S Y S R RLNFAVLKQ I AEDCG SVLWVDMAHFAG LVAGGVFVD EEN P 
. I PYADIVTTTTHKTLRGPRGGLVLATREYESTLNKACPLMMGGPLPHVI AAKTVALKEAL 
S VDFKKYAHQ WNNAR RLAE RF LSHG LRLLTGGTDNH MMV I D LGS LG I S GK I AED I LS SV 
G IAVNRNSLPSDAIGKWDTSGIRLGTPALTTLGMG I DEMEEVADI I VKVLRN IRLSCHVE 
G3SKKNKGELPEAIAQEARDRVRNLLLRFPLYPEIDLEALV 

CPn_0522 603825 604655 

CT433 hypothetical protein

REPLSPEKTSLAFKVKNVNQRMIKKNQGKKKNYFQY I PLKVQKLRQPSFYPKRLMTLYLG 
LT4QKTARKYQAHYLP I LTLFPYAKST PQNKRALQFLPQATHVI LTS PSSTHLFLSRMTSL 
LSKATLKTKTYLCIGESTKERLLSFLGQVKYWATQEIAEGIFPLLQALPSSARILYPHS 
5JLARPVTREFLYNRFTFFSYPHYTVKPRKLKKNILSKYKKI IFTSP3TVRAFAKIFPRFP 
EKTYWCQGRMTLQEFQKFSSQKQVSLLETLGKSRT3P 

CPn_0523 604720 605052

No robust homolog present in Genebank/EMBL as of 11/7/98
KMACSATPGFDGTAF^LFPPATRPRYNFKLALFVTIArALVWIALIATTIAIGLCIHPLC 
.jF I FLTA E PLYF ESRY I CTHYARNVY IALDWrDM'JKLQDMRCHSP T F3DR 

(:Pri_() r i^4 on r >l)7 ( l hOi.17^ 

No robust homolog present in Genebank/EMBL as of 11/7/98
F7FYKKNVM:X:P::RT[\' VV^VLJYVPRDKCI APKKOFTI AK [ '/TLAILA.'JLALGALVAG 

! . ; lt i v r / ; n i *v f i . a i . i i • it \ r . [ ■ : ;wt r l vy m omt : : k v:; g nwu y v t . con f k p lc, kawq ekn 
v r / ; y r ; r j FMy f y n n i i ln p k f k va ryTDA:'o pfo ptfltglrv [ t KNy:nx; r efnpvgptnl 

I (/M'rA , I*tir,::Tt(/t'::TL.K!)K':VWm , f'KOKr:OGPAKCFr/pr:3F''rWRVVKLPNEALDOTFNL 

nlggak.kk:: i l.ptfu :iivo ; p k : : r. e l r Ny y e y y ru a l i a y fnc l kaj\ r e i ihaa i val p l f 

■r:;7VKVI'f J KF[Fl'KR rri'YWI)NU , [VjA[' , i;KHALl.DA[OrrrAI.I'YF'OR^LLVrLODPFNTEE 

:;y::i<:;t-:i-; 



t^Pn_0525 'Vj'iSlO ^0 7283 

CT398 hypothetical protein

G I E FMHDALL3 1 LAIQELD I KM I RLMRVKKEHCKELAKVQ5LK5D t RRKYQEKELEMENL 
KTO r RDGENRIQEIGEQ INKLENQQAAVKKMDEFNALTQ EMTTANKERRSLEHQLSDLMD 
KQAGGEDLI VSLKE3LASTENSSSVI EKE I FE5 1 KK INEEGKALLEQRTELKHATNPELL 
S I YERLLNNKKDRVWP I ENRVCSCCH I VLTPQHENLVRKKDRL I FCEHCSR I LYWQESQ 
VNAQ ENSTAKRRRRRAAV 

VFNLIfcKdFOVC^UiLKLu^KENKMFirMtJTDVCQDiLOK^KhAVDFrFQAFQPKEAM 
QLAEKI LGHSGWVFFSGVGK3GCVARKLVATLQSLS ERALFFSPVDLLHGDLGLVSPGD I 
VCLFSKSGETQELLDTVPHLKSRRAI LVAITSMPYS^^^ALSDLVVILPSVAEIJDPFNL I 
PTNSTTCOM I FGDFLAMLLFHSRGVS LSTYGKNHPSGQVGMKANGKVKDFMFPKTEVPFC 
HLGDKVSFSLE 1 /FSAYGCGC T /C IVDPQFRL^K3IFTIX3DLRR3LASYGGEVI^I^LEKVMT 
ANPRCITEDSDIAIAl^IJ^ESSSPVAVLPVU^EENRKVTGLLKMHTLAK^ 

CPn_0527 609910 608726 

sucB-Dihydrolipoamide Succinyltransferase

RYMI FEFRFPKIGETSSGGS I VRWLKNLG DHVARDE PL I EVSTDKI ATELPS PKAGRLVR 
FCWEGDEVASGEMjGLIELEEISEADDESTSCPPTSCETKSEAGSSSSSVWFSPAVLSL 
AQREGIGLDNLQKIAGTGKGGRVTRQDLZAY I SESQQVS I PEIr QG EVNRI PKS PLRRAI 
ASSLSKSSDE^HASLVVDVDVTDU^ISGERORFLOTHGVKLTrTSFIVCXTLAQTUlQ 
FPLLNG SLDGTT IVMKKSVNVGVAVNLNK EGVWPVI HNCQDRGLVS IAKALADLSSRAR 
LNKLDP S EVQDG SVTVTNFGMTGAL I GM P 1 1 R Y PEVAILG I GT IQKRVWRDDDS LAI RK 
MVYVTLTFDHRVLDGIYGSEFLTSLKNRLESVTMG 

CPn_0528 611165 609921 

gltT-Glutamate Symport 

LMKLWMKIFIGLFVGVTiyGLVLEDKAIFFKPIGDIFLNI^S>mT 

i*ikklgrigiksvglyijgttaiaivigix:^ 

aayflsi iaqvfpsnpvrsfaegnilqi i ifaiflgialrlsgergrpverfidgfseim 
lrmvnm i ms fapygvgasmawi sgnhglgvlwqlgkf i iayylaclfhatlvfgglvrfg 
ckmsfskflssmmdai scavstasss atl pvtmrcvsknlgvsaevsgfvlpiigatvnmn 
gtaifqgmaavf iaqaync pls ls sllllwtat fs avg sagvpgggm itlgsvlasvgl 
piqgi ailag idrlrdivgtpmnrlgdawatyvasgegelspyes ikqesvett 

CPn_0529 612298 611165 

ycaH-ATPase 

FSCKEIRAFKRGTMKKRFPSTLFLFYRRVTIAISLEGIL^^ 

RFSWSTPYRARSTVI SVGNIWGGAGKT PTVLWI^^JEALRLRG YSCGVLS RGYKSQS SRQK 
KLTVVIJSKVHSASYVGDEPIJL^tAEKLPEGSVWVHKDR^ 

KLHKDVEIAVVNGQDPLGGRAFFPKGRU^FPIJU^KTVDAI IVNGGGKEAGTVVKRVSNA 
PQIFVKPTIASWWTHNGERI PKEALRELRVGVFCGIiGFPQGFLNMLREEGIHIIXjKYLiL 
PDHAAITKKEIiNYFCQQMAMRQGQGLLCT EKDSVKLPRL SGEVSLL P IAKVEMRLSVNQD 
DTLSLLNMIEQIHKNRGN 

CPn_0530 613323 612460 

spoU-rRNA Methylase 

SWLWGKFLWRRCGSLAFWEFCSMDC IGKHNPLVKEALALKRSRCRKSSWFLVEGAREIQ 

KAIJlTGYI^QHVFCSTHLSEKEKEFLYELKRNSTKrLYCLCSTIAQLSFKEHHDSFVAVI 

QK^\WNKEI)FLIQRKNAQPFYLIIEQVEKPGNVGArLRIAIX;AGVDG^ 

NVVRSSl^AWSLPILSISREEGKELFKQEGWTVFVTSPRAETMYFSKNYI^ 

EKrcLTEIWSEDFSEIALPMLGESDSLNLATSVAAVAYEVVRQRWVN 

CPn_Q531 614198 613245 

SAM dependent methyltransferase

DSSKDDFRKEKGRRKSQYRDRYVNKDTGRHSKTYFSLIRERLVMDYKLLDSGIX^KLECF 
G PVTL I RPS S I AVWPKSRPELWSQAQLQYVREGERGAWKNFK RL PEEWEVAFS DVRCLLK 
RTPFGHLGVFPEHMGFWPALKQAI EKHKERQVLNLFAYTGAGS I FAAKCGARVTHVDASQ 
AAVRWAQRNVEKNAFPERRIFWVIEDVISFLKf<EIRRNKKYQVILLDPPSYGRGPDGEVF 
KIDKDLFPLLSLCSKLLADDASYFLLTSHTPGHTPEFLRAIAJRRSVPTLVSEAWSCGESF 
CG EGVGALPSGSFVQWIA 

CPn_0532 614716 614075 

ribC/risA-Riboflavin Synthase

ESFCCKDSWXWGGMFSGI iyELGEVCFFEAQGNGLSLG IKSTPLFVTPLVTGDSVAVDG 
VCLTLTSCNESKIFFDVIPETLACTTLGEKRCSDQVT^EAALKNKjDSIGGHLLSGHVFGT 
AEIFLI KENRYYFRGSKELSQYLFEKGF I AIDG I SLTLVSVDSDTFSVGLI PETLQRTTL 
GKKREGERVNIEIDMSTKIQVDTVKRILASSGKD 

CPn_0533 614918 615385 

CT406 hypothetical protein 

EVAPMQC PFCNHGELKVIDSRNAPEANAI KRRRECLKCSQRFTT FETVELTLQVLKRDGR 
YENFQESKLIHGLNAASSHTRrGQDQVHAIASNVKSELLGKQNREISTKEIGELVMKYLK 
KADM I AY I R FACVYRRFKDVGELMEVLLSATPDMEK 

CPn_0534 615389 615784 

dksA-DnaK Suppressor 

LN F T RS KW PLS DDE I EQFKKRLLEMKAKLSHTLEGNAQEVKKPNEATGYSQHQADQGTD 
TFDRTISLEVTTKEYELLRQINRALEKINESSYGICDVSGEEIPLARLIAIPYATMTVKA 
QEQFEKGLLSGN 

CPn_0535 615763 616296 

lspA-Lipoprotein Signal Peptidase

KRTPIWKLSSMATRFRSTLLVITLFVLIDWVTKLWLLQYKDLQILTHPTLYTHSWGRFS 
FSrAPVFNEGAAFGLFSNYKYFLFLLRIFVILGLLAYLFFKKKSIQSTTCTAL'/LLCAGA 
rGNVGDI IFYGH IVDF ICFNYKOWAFPTFNVAD^/LISLGTLLLVYKFYFPTKQTEKKR 

CPn_0536 616300 617601

dagA-D-Ala/Gly Permease

YR:;i,Q£ KLR YFFKTVMNRLLSLLSVFDDFFWCY* /AF I LI IVLGVSF.jWK;iRr['*y(-TKFSO 
FCKLFF' ^V:;OMP0CRETK(X , ;VHPLKVFFASA(XJrirGIGrry , yG rVTAAC [ OG PG AL FWVW I 
A'; rrC'. r/KY::F.V , t , [AUKFRKLDRDGWyGGPW'^FLIKAFKTPVVr;VrVAr[,!/ IVL'VEI 
1 '0r':vrTDrTLAH( > WNI,PKV r \TMLGLLFL7P('AIPGGLQPIGKICr,r/LPFFM[.LY(^Lr;L 
^n//KrJllTLrHt 1 L:;TVF.'^7AFKGOnAI^GFAGCTVATTIHOGISRAAY'>;ij[^I^FD. ( ; [ 
iy';F.r;';AKDp':T0AyL.:J rvc EA tDNL ECTL^LLMVLASGSWSLGLENA^OWEHTLAiJYF 
I'MVKFFI.lTFnvr\'Y , n'I rSYFrA/GKKCAKFLYGNTGAKIYTLYGLLn.PLh^FL^ONT 
ALUEM r ;V>lALt,U'l' IML.UA'F [ Lf'KEV [ F P APAA G LT ETC LST E 

<"Pn_0S l / hi Ml', M H iH't 
CT814.1 hypothetical protein

LrPLLFMDNYLLGSLI FCCVLL:: IGMCT I FVMT [CFLRQLNKILKNtHRVTTILNFEAKI 
LAPLMLi;KKLU:Gl^KPKNRr^^LJED[OCLLDEKKQRJWKKNLDQGIKWCAALVLIV^ 
FRNKD 

CPn_0538 hlB123 613511 

CT814 hypothetical protein

TKELNGAOHWririFCK^^/TKTKTLRDr'mFR^HKFKKTKCKRFRWLRGVLFCGFIATLL 

- v iff ,l>l 'KL- 

CPn_0539 618678 621545 

pmp_19-polymorphic membrane protein

GYNLLCLRHMKQMRLWGFLFLSSFCQVSYLRANDVLLPLSG I HSGEDLELFTLRSSSPTK 
TTYSLRKDFIVCDFAGNSIHKPGAAFLNLKGDLFFINSTPLAALTFKNIHLGARGAGLFS 
ESNVTFKGL HSLVLENNESWGGVLTTSGD LS F INNT SVLCQNN ISYG PGGALLLQGRKS K 
ALFFRDNRGTI LFLKNKAVNQDESHPGYGGAVSS I S PGS PITFADNQEI LFQENEGELGG 
A I YNDQGAITFENNFQTTS FFSNKAS FGGAWSRYCNLYSQWGDTLFTKNAAAKVGGAIH 
ADYVH I RDCKGS I VF E ENS AT AGGA I AVNAVC D I NAQG PVRF I NNSALGLNGGA I YMQAT 
GS I LR LHANQGD I E FCGNKVRSQFHS H INSTSPfFTNNAI T I QGAPREFS LSANEGHR ICF 
YDP I ISATENYNSLYINHQRLLEAGGAVI FSGARLSPEHKKENKNKTSI IMQPVRLCSGV 
LSIEGGAILAVRSFYQEGGLLAI^PGSKLTTGXjKNSEKDKIVITNL^ 
I RATEKAS I EI SGVPRVYGHTESFYENHEYASKPYTTS I ILSAKKLVTAPSRPEKDIQNL 
1 1 AESEYMGYGYQGSWEFSWSPNDTKEKKT II ASWT PTGEFSLDPKRRGSFI PTTLWSTF 
SG LNIASNIVNNOTLNNSEVIPI^HLCWGGPVXQIMEQNPKQSSN^ 
I PFSFNT ILSAALTQLFSS SSQQNVADKSHAQ I LIGTVSLNKSWQALSLRSS FSYTEDSQ 
VMKHVF PYKGTSRGSWRNYGWSGSVGMS YAY P KG I RYLKMT PFVDLQYTKLVQNP FVETG 
YDPRYFSSS EMTNLSLP IG I ALEMRF IGSRSSLFLQVSTSYI KDLRRVNPQS SASLVLNH 
YTWDIQGVPI/SKEAIJJITLNSTIKYKI^/TAYMGISSTQREGSNLSANAHAGLSLSF 

CPn_0540 621631 626862 

pmp_20-polymorphic membrane protein

F I HLIYLSL I EFVNISDRFSSMKWLP ATAVFAAVLP ALTAFGDPASVEI STSHTGSGDPT 
SDAALTGFTQSSTETD3TTYTIVGDITFSTFTNIPVPVVTPDANDSSSNSSKGGSSSSGA 
TSLIRSSNLHSDFDFTKDSVLDLYHLFFPSASNTLNPALLSSSSSGGSSSSSSSSSSGSA 
S A WAAD P KGGAAFY SNEANGT LTFTTDSGN PGS LT LQNLKMTG DG AA I Y S KGPLVFTG L 
KNLT FTGNE S QKSGGAAYT EGALTTQ A I VEAVTFTGNTS AGQGGA I YVK EATLFNALDSL 
KFEKNTSGQAGGG IYTESTLT I SNITKS I EFI SNKASVPAPAPEPTSPAPSSLINSTTID 
T S TLQT RAASAT PA VAPVAAVT PT P I STQ ET AGNGG A I YAKQG I S I ST F KDLT FKSNS AS 
VDATLTVDS STIGESGGA I FAADS I Q I QQCTGTTLFSGNTANKSGGG I YAVGQVTLEDI A 
NUl^frNNTCKGEGGAI YTKKALTINNGAI LTT F SGNT ST DNGGA I F AVGG IT LS DL VEVR 
FS'KffoGNYSAPITKAAShTTAPWSSSTTAASPAVPAAAAAPVTNAAKGGALYSTE^ 
SG^3pSILSFE^]NECQNQGGGAYWKTFQCSDSHRLQFTS^^C^ 
TGK^LFQENSSEKHGGGLSLASGKSLTOTSLESFCLIWWAKENGGGANVPEN 
TP£©NEPAPVQQ PVYGEALVTGNTATKSGGG I YTKNAAF SNLSSVTFDQNTSSENGGALL 
TQKAADKT DC SFTY I TKVN ITNNT AT GNGGG I AGGKAH FDR I DNLTVQ S NQAKKGGGVYL 
ED^IXEKVITGSVSQNTATESGGGIYAKDIQLQALPGSFTITDNK^ 
GI&SSGAVTLTNI SGTFGITGNSVINTATSQDADIQGGGIYATTSLS INQCNTPILFSNN 
SA&¥kKTSTTKQ I AGGAI F SAAVT IENNSQP I 1 r LNNSAKS EATTAATAGNKDSCGGAI A 
Al^TLTNNPEITFKGNYAETGGAIGC IDLTNGS PPRKVSIACNGSVLFQDNSALNRGGA 
IYfeEJI DI SRTGATFIGNSSKHDGSAICCSTALTLAPNSQLI FENNKVTETTATTKASIN 
NL^KlYGNNETSDVTISLSAENGS IFFKNNLCTATNKYCS IAGNVKFTAIEASAGKAIS 
FYR^yNVST KETNAQ ELKLNEKATSTGT I LF S GELH ENKSY I PQKVTFAHGNL I LGKNAE 
LSiW^FTQSPGTTIT?<GPGSVLSNHSKEAGGIAI^^W^ IDFSEIVPTKDNATVAPPTLKL 
V S RTNAD S KDK I D I TG TVTLLDPNGNL YQNSY LG EDRD I TL FN I DNS AS GAVTATNVTLQ 
GNlGAKKGYLGTWNLDPNS SGSKI I LKWT FDKYLRWPY I PRDNHFY INS IWGAQNSLVTV 
KQG iLGNMLNNARFEDPAFNNFWASA I G S F LR KEVS RNSDSFTYHG RGYTAAVDAKPRQ E 
FIimAFSQVFGPiAESEYHLDNYKHKGSGHSTQASLYAGNIFYFPAIRSRPILFQGVATY 
GYt£QHDTTTYYPSI EEKNMANWDSIAWLFDLRFSVDLKEPQPHSTARLTFYTEAEYTRIR 
QEKFTELDYDPRSFSACSYG^AIPTGFSVTCAIAWREIILYNKVSAAYLPVILRNNPKA 
TY |5(^STKEKGNVVNVLPTRNAARAEVS S Q I Y LG S YWTLYGTYT I DASMNTLVQMANGG I 
RFVF 

CPn_0541 627137 623003

Solute binding protein (yebL-Synechocystis Adhesin Homolog)
NNRjgYQTAFVMHKVIVFI FLTLYSLKSYGNDVIDKPHVLVS IAPYKFLVEQ IAEETCFV 
YA-E^NH YD PHT YELP PQQ I KELRQGDLVJFR I GEAFEKTCERNLTCQQVDLSQNVSL I QG 
KPCCNQHTTNYCTHTWLSPKNLKVQV 

EI LTITSKAKQRHI LVSHGAFGYFCRDYNFSQHTI EKSSHVEPSPKDVARVFRDIEQYKI 
SS VI LLEYSGRRSSAMLADRFHMHTVNLDPYAENVLVNLKT I ATTFSSL 

CPn_0542 628000 628737

ABC Transporter ATPase

FMTIRI LAEGLAFRYGSKGPNI IHDVSFSVYDGDFIGI IGPNGGGKSTLTMLILGLLTPT 
FGSLKTFPSHSAGKQTHSMIGWVPQHFSYDPCFPISVKDWLSGRLSQLSWHGKYKKKDF 
EAVDH ALD LVG LS DH H H HC FAH LSGGQ IQ RVL LARALAS Y PE I L I LDE PTTN I DPDKQQR 
ILS I LKKLNRTCT I LMVTHDLH HTTNYFNKVFYMNKTLTSLADTSTLTDQFCCHPYKNQE 
F5CSPH 

CPn_0543 628710 629603

(Metal Transport Protein) 

KSG I FM LS S L I R DS F P LL I LL PTF LAALG AS VAGGVMGT Y I WKR I VS I SGS I SH A I LGG 
rGLTLWIQYKLHLSFFPMYGAIVGAIFLALCIGKIHLKYQEREDSLIAMIWSVGMAIGI I 
F [ S RL PTFNG EL I NFL FGN I LWVT PSDLYSLG I FDLLVLG IWLCHTRFLALC FDERYTA 
LN H C S VQ LWY FLLLVLTA ITIVMLI YVMGT ILMLSMLVL PVA I ACR FSYKKT RIMFI SVL 
LN -[ LC S F SG I C I AYC L DF PVG PT I S LLMG LGYT AS LCVK KR YN P ST PS PVS P E I NTNV 

yhbZ-GTP binding protein

K: ':',VF f 1 1 KS.'JFFCLKKDKWV IMFVDO rTLELRAGKGGNGWAWRKEKYLPKGGPYGGN 
i;wN(/::;VI lEATT^VY ■ F EAY R N I P F L K A PDC Q ^GATNNRTGRnGKDL TVSVPTGTLLRD 
AK'Iv;iai.MDinVD^EIU.LV:;0flOK'/;KGNTrFKT^IRAPrKATPGKPC:EIROVELELXL 
f Mi LfiLV IF[ 'NA' ;K.';TL.KNTI.AHTEVKV^.AYPFTT[>APSLGLV[ 1 CKDRLYQKPWr IADIP 

- ; i t i-:r ;aj fjNKt :u ilpi'lhii r uKtllllfv idv ;krepnspeedletliheu{.'jhqpdfek 

KPMLVAI.NK I DUt .LPnnjLD LtJ'IHJKRl PSYTPVL [."GI,T( IEGVEX JLYRFFTORLAV 

' "! nJVA f, <,.)')0H o 10*. j <, 
r 1 .;'/ I,;; / t ihoimin. t 1 ft otiun 

THAI' !•'.!■' i ' V MAI IK K f lASI'N' IRD.'JK.'JKRL. //KVi IAf ;OKV;Tf 1 f A/HJRGTPWNPAQNVC 
t« Wil Am .FAI ,VU ; 1 VVMKKTNRTY F'lWFTOL 



rl21-L21 Ribosomal Protein

LSKQRLTLS I ERF IRKKLMEPYAVtQTGSKQYQVRJGDV IDVELLGFT/A.-DK^IFQDVL 
FVFDGTKASLG3 PT t ANAQVKA EY LS HVKG EKW A Y KY K KRKNYHRKHGHRQKYLRVK I R 
EILI 

CPn_0547 ^3158^ 6321^6 

yabB family 

, . , , ( - ... si"- - / > V^ip^'">jT-'pf J pi,-ptv;* " '?' ' \ ^ '-'?r' r ~r.~TK r> ' 

•'. : . 'eh\ *; m " "r^iv ' ' . ■ /• g: 1 :: 

EALKSLKPNOKIJHVAITIEGSKPKFLCKLJALRtJNIAOVKNLTKrDIGITATSGECLSD 
FGCGDGVQCFC/LTVMr^CD 

CPn_0548 633234 632191 

cysJ-Sulfite Reductase 

KMYLQEKFKA^VPLVLRELLSCSDSI^roSDPIYRMVFDS^^^TTISYKVGDALGVLPENS 

KEVSEHVLQLLGYSPTTLVNVKKTSEKVSAQKF IQGYVDLDK I PAKLKS FFPDKDPKITL 

YDAIQEYRPQIPIELFAESVFPLLPRFYSIASSPDLHPKSIELLVKHVSYPGKYQKRFGV 

CSSFL^SELQVNDSAYIFVQPTKHFTLSTQTEGKPLVMIGAGTGIAPYKAFLEEPiFT^ 

PGNNIXFFGERKEKVNFYYR^FWNHAZEEGKLKL^LAFSRERDQKV^ 

KAYEEGGFFFVCGRJ<VLGIEVKHALEEILGKDTLASLRKEHRYVVDVY 

CPn_0549 633662 633255 

rs10-S10 Ribosomal Protein

PQDVQHQFWNQHSUJ^FlJCKFKKRIiRSKGCMKQCKQKIRIRLKGFDOGCLDRSTADrVE 
TAKRTGARWGP I PLPTKREVYTVLRS PHVDKKSREQFE I RTHKRLVD I LDPTG KT I DAL 
KMLALPAGVDIKIKAA 

CPn_0550 635688 633580 

fusA-Elongation Factor G 

LTTYGENNKFMSNQEFDLSAIRNIGIMAHIDAGKTTTTERILFYAGRTHKIGEVHEGGATM 

DWMAQEQERGITITSAATTVFWLGAKINI IDTPGHVDFTIEVERSLRV'LDGAVAVFDAVS 

GVEPQSETTVWRQADKYGVPRIAFVNKMDRMGADYFAAVES>1KEKL^ 

QFVGMVDLI S QKAL YFLDDTLGAKWE EKE I S EDIJCERCAELRANLL E ELAT I DESNEAFM 

MKVLFJ}PDSITEDEIHQVMRKGVIE>nQNPVI^CT 

NIRGI^IIJC^DQEISI^PRRMPLAALAFKI^^^DPYVGRITFIRIYSGTLKKGSAILNSTK 
DK KER I S RLLEMHANERTD RD EFTVG D IGACVGLKF SVTGDT LC DDNO E I VLER I E F PD P 
VIDMAI EPKSKGDREKLAOALSSLSEEDPTFRVSTNEETGQT 1 1 SGMGELHLDILRDRMI 
REFKVEANVGKPQVSYKET ITVSGNS ETKYVKQSGGRGQY AH\ r C LEIE PNEPGKGNEWS 
KI VGGV I PKEY I PAV I KG I EEGLNTGVLAGYGLVDVKVS I VFG S YH EVDS S EMAFK ICGS 
MAVKDACRKAKPVI LEP IMKVAV ITP EDHLGDV IGDLNRRRGK I LGQES SRGMAQVNAEV 
PLSEMFGYTTSLRS LT SGRATSTMEPAFFAKVPQK IQEE I VKK 

CPn_0551 636174 635698 

rs7-S7 Ribosomal Protein 

MYMSRRHSAEKRDI PGDPI YGSVILEKF INKVMMHGKKSVARKIVYSALERFGKKLNLEN 
VLEGFG EALENAKP I LEVR S RRVGGATYQ VPVEVAS ERRNCLAMQW 1 1 KHARSKPGKSME 
VGIATELIDCFNKQGATIKKREDTHRMAEANKAFAHYKW 

CPn_0552 636698 636219 

rs12-S12 Ribosomal Protein

IQAGYVPSS SENKPLPTKRALLY I SMLVWRLKREEYMPT I NQL I RKRRKS S LARKKSPA 
I^KCPQKRGVCLQVKTKTPKKPNSALRKVAWVRLSNGQEVIAYIGGEGHNLQEHSIVLIQ 
GGRVKDLPGVRYH I VRGTLDCAAVKNRKQ SRSRYGAKRPK 

CPn_0553 637753 636812 

No robust homolog present in Genebank/EMBL as of 11/7/98
GCMWRWLRFLI IFILGRAVFPLRASESFSWETSTCLTv-LGI PFIDI I LTTN EDFVAQCG 
LQ IGTI SSTNNAKIKE I FL IYKEKFPEAS ISFKRKEPLNLSQSHLSDLG I LCMRNGETYA 
EGMANKENGPALKQPKDLRLVLRCPNOPDTLLYSEKEAEKGIETNTCLCNQGYTLLDGQL 
I LYGDS IEKFLKETKRKNNHTLVDLCDSQVVTTFLGRFWSLLNYVQVLFLSEDSAK I LAG 
I PDLAOATQLLSHTVPLLF IYTNDS I HI I EQGKESSFTYNQDLTEP I LGFLFGYINRGSM 
EYCFNCAQSSLGET 

CPn_0554 637806 638141 

CT440 hypothetical protein 

VF S YLLLC 1 1 LVYVRFMYEGKS RMAS PTPGQ LHLQQKVES KA YDYS RS LAM I ATALLFF I 
VALILSGLSLLPQVFLPFSGAYF I IGSFLAF IALG I LLINCVCDLKQYLTSS 

CPn_0555 638298 640241 

tsp-Tail -Specific Protease 

mfvmkk lvrlc wlls llpnvlfs s dllr eeg i kkmmdk l i eyhvdaq evs tdilsrsls 
syiqsfdphksylsnqevavflospetkkrllknykagnfaiyrninqlihesilrarqw 
rnewknpkelvleassyqiskqpmqwsksldevkqrqralllsylslhlagasssryeg 
keeqlaalclrq i enhewylg indhgvamdrdeeayqfh i r wkalahsldahtayfsk 
dealamriqlekgmcg igwlkedidgvwrei ipggpaaksgdlqlgdi iyrvdgkdie 
hlsfrgvldclrgghgstwldihrgesdhtialrrekilledrrvdvsyepygdgvigk 
vtlhsfyegencvsseqdlrraic<;lkeknllglvldireotggflsoaikvsglfmtng 
vwvsryadgtmkcyrtvs pkk fydg pla i lvskssasaae i vaotlqdygvalvvgdeq 
t ygkgt i qhqt i tg da sqd dc f icvtvg k yys psgksto lqg vks d i l i ps ly aedrlge r 
fleh pl padccdnvlhdpltdldtqtrpwfqkyylpnlqkqetlwremlpoltkns eqrl 
sensnfqaflso i kssektdlsycsndlqlees in ilkdmillqocrk 

CPn_0556 640921 640325

crpA-15kDa Cysteine-Rich Protein

ENGMSSNLHPVGGTGTGAAAPESVLNIVEEIAASGSVTAGLQAITSSrGMVNLLIGWAKT 
KF I Q P I P ES KLFQS RACQ I TLLVLG I LLWAG LACM r I FH SQLGANAFWL 1 1 PAA I GL I K 
L L VTS LC FD EACTS EK LM V FQ K WAGV L EDQLD DG I LNN S NK I FG M V KT EGNT S RATT PVL 
NDGRGT PVL^ P LVil K I -\RV 

rpii_()b c ^ (.i?H/f) *,4ir»4 

omcB-60kDa Cysteine-Rich OMP

E [ PM:JKL t RKWTVLAI^T' "MA^CFAGf *fj t CAAV/AESLITK IVASAKTKE^E'Vl'MTAKKVR 
LVKRNK'jI'VFCK.'Hi lAIVDKKFV[>r Efyjp.c 'Qt'VnAQOE^CYGRLYSVKVNUX 'NVH ICQS 
VPEYATVJ:ir\ l'[i:[[.AU:KKDf VDWrTQQLf^-'EAEFV^DPErrrriDnKLVWK IDRL 
' JAGDKC'K [TVWK['[,K[I *a \ *I-TAATVCAr PELR.'JYTKOnQPA rc iKora ITTX "AC'LHCPVC 
7 K I EV VT ITf I . : \ I A tl MVTV r iN p V P U ] Y [] \ i H V I >: : FNLG DMR P^.PK KV rT V V, I f ' PQ U ft 

o itnvat'/tyl'v;; iHKi-.-.ANvrry/iiRr-' vijvni ;adw.';yvckpvey.: tsvsniv idlvlh 
uvv iyi/rLi"'^VTvr,i' ah k :\-irr mkv/wh t k [*m * et lo f k l w k a<j v [ < ;m-fNOVAV 
t ' ; E. f i N( v ;tc t : 'c a ! r r r n i wk< ; t j\a* rn m' • v i . i ttndp i cvr ; cnt \r{ r i c \rrtt w : . : a e i /rr jv • : 

LII.KI-IKELOPIA : ^irTKtrr [ 7;MTV7i-'DAi.PK[;WKE:;VEF:;VTI,Kt: [All ipahgha [ 
UWDTl .T.' >I 'V.'TDTENTHVY

<:Pn_f;')*>H Miit>> *>4JQ3L 

omcA-9kDa Cysteine-Rich Lipoprotein

KLMKKAVL rAAMFCGWjL.'I.XCRIVDCCFEDPCAPSSCNPCEVTRKKERSCGGNACGSY 
VPSCSNPCGSTECN^QoPQ-VKGCTSPDGRCKQ 

'TnJ ,r ''^ M '.710 MTH7 

■Tl-i ! . : nyr-.r !• ■ i • ■ t --n 

■.NNKYTUaJKriLv lu'^,.. i ,i . li t. " > " i . .TPR.Vi P , ' '.vNKAH.'LC 

FFLARSVFNTCYNTNL 

CPn_0560 645666 644098 

gltX-Glutamyl-tRNA Synthetase 

RNS RFQGMKSLWSKDKR I MNWENVRVRVAPS PTG DPHVGTAYMALFNE I FAKRFKGKMI L 
R I EDTDRTRSRQDYEENI FSALRWCG IQWDEGPnVGGPYGPYRQSERTKIYQGYVETLLK 
T DCA YKC F AT PQ ELAEMRA VASTLGY RGG Y DR R YR Y LS P E EVAS REAAGQ P YT I RLKVP L 
SG ECVF EDY SKGRWF PWADVDDQVLVKS DGF PTYH FANVIDDHLMG ITHVLRGEEWLSS 
T P KHLLLYEAFGWEPPVFLHMPLLLNPDGTKLSKRKNPTS I FYYRDSGYVKEAFVNFLTL 
MGYSMEGDEEVYS LERII ETFNPRR I GKSGAVFD IQKLDWMNKHYLNH EGS PECLL KELQ 
GV^LNDEFFLKILPLCQSRITTIAEFI^TSFFFSGIiEYKVEELLPQAI^PEKAAIIXY 
SYWYL'EKTDQWTKETCi'LGSKWLAQAFNVHHKKAX IPLLYVAITGKKQGLPLFDS IEIL 
GKPRARAKLVYAEKLLGGVPKIO^TVDKFMQREDFEEATFDL 

CPn_0561 646407 645871 

euo-CHLPS Euo Protein 

LMACEQHEGCYELEEREEIEDIKDSDTKWVSITQAAKLHW 

TRWEIDI KDLEEYKRNRYSRKKSLYCGELVFDNGKGCYS INQVAQI LG I PVQKVYYATRT 
GT I RG ERKGAAWVI HVS EI ERYKNEYLSKQAAKKLKGAEPKEHQAPNFE PPT E I FPESN 

CPn_0562 648051 646918 

*CHLPS 43 kDa protein homolog_1

NYKVIMSIAIAREQYAAILD^PKPSIAMFSSEQARTSWEKRQAHPYLYRLIjEIIWGVVK 
FLLGLIFFI PLGLFWLQKICQNF ILLGAGGWIFRP ICRDSNLLRQAYAARLFSASFQDH 
VSSVRRVCLQYDEVF I IX3LEIJILPNAKPDRWMLISM3NSIXLEYRTVLQGEKDWIFRIAE 
ESQSNIL I FNY PGVMK S QGN I TRNNVWS YQACVRYLRDE PAGPQARQ I VAYGYSLGASV 
QAEALS KE I AEG S DSVRWFWKDRG AR STGA VAKQ FIGS I/Tw^WLANLTHWN INS EKRSKD 
LHCPELFIYGKDSOGNLIGDGLFKKETCFAAPFLDPKNLEECSGKKIPVAQTGIJ^HDHIL 
S Di^i KEVAGH IQRHFDN 

CPn_0563 650113 648293

recJ-ssDNA Exonuclease

QYi&LLWDFSPKGPCG IKFMTNSDNASAAGLLWAHPKEDPAFLGMI IKEFHLPPTVAQIF 

ISRGFQTIQEIHKFLYSHLSSLYDPGLFLOTSKAVERIJ^LARDRKEHVMIYGDSDVIXSOT 

GVAMiVEFLRCIDVHVSYFFI^AILKQHGETSTLIAKLKEEGITLLITVDCGITAGKEVS 

DITKQG IDVI ITDHHMPTGKI PHCVATLNPKLPJ3HTYPNRELTGWGVAFKIARGVLNAL I 

S R&EV PKS QGSLKKLLDLVT LG T I TDVGVLLG ENRVMVRYG I KE I ARGAR PG LNKLCALC 

GVE^EVTSTDIVLKIAPECI^SI^RLDDPAKGVELLLTQDDERVDALIMELDNINRERQR 

I EAEVFQDVQEI LNSNPEI LKQAAIVLSSTAWHARVI PI ISARLAKTYNKPWI IAIQRG 

IGggjp AllTIGSFPLLGVLKKCSSLLLSYGGHDFAAGVIMKEDKVEDFKKKFVHLVN 

KG $£LP HLE I DAYADFDAI DYDLLASMELFEP FGKGNLMP I FYSKVRQVRYPKVL PGNHL 

KLYXSQKERNLEGVAFGLGRHADALKASWHYPLEIAY^ 

SEPRFSD 

CPn_0564 654359 650145

secD&secF-Protein Export Proteins SecD/SecF (fusion)

SGA63KQKVKRNFAI 1 1 CVFALALYYVL PTCLYYAK P LDKK IDGNEAEH 1 1 KS FTKQAQQV 
RKBVI PRVSAILSSLHLRGH IQQHPAI PDIVSVRFKRGEDAEDFIGNLVHGEPNVPIKSA 
RLfWGYSREHDDHVIQVASSINTSLVESDFSFVSYSSENEOEMASSILQRVYSACTFPK 
QKTCSCSYPSIWETAPKEQLLQYAKNLSSGFEVFSSRLSAFCQQSFSSNQDRLAFLSRLS 
SL^ffiAAIDVEDQKLLKSVYETLSQTACIRSLDCPY I EGLRLDCSESSLFFS S IEYCPKE 
R K I F.LTLH S DLLAQRT S L S KEQRLDF DSRIAVEKQKLS KNLTVQVEDYNNGFS FQWMDKD 
TQGf 3 ILQGERLLQGIAEHLTALTLHRPAAESCDLI PENFPVFCRQPRESEAFGCYIFSP 
OTI)Q^FSKGSVYILLKGLRSIVAKYQQGGGKEI^SFEKDLQNLYNCFSHTEAISWTI^ 
DQ^Ce I RH P LQQ FLDVWGEGFV IGKEGCAFLEVKD I QDRLATVNQI EKNRQ S DLVRWH EQ 
YRHAKCSMDLQERLSAPIPYQNLFLENMKLNMRKFSRGENILRIGIDFVGGRQLLLSFKD 
HQGKQLTDKEDI LKVSDELCARLNKLGVSEIELRREGDY IHLSVPGSSTISSSEILGTSK 
MS FHWNERFS S YSAS RYEVQRF LDYLWFTSQAQG KTS P EE INTFASALFNEEVDVPPS V 
HEAITKLKSEGLAFSPSGCETPSTDLITrTFSMIAIGKDAEQKANPLVIVFRNYALDGASL 
KDIRPEFAAGEGYVU^FSVKin'SPKKMAEKLSPTESFHTWrSAYCQEGISGTAN^YSAN 
RGWRMAWI DGYMVSS PILNVPLKNHASVSGKFTHREVSKLASDLKSGAMSFVPEVLSEE 
T I SSDLGKKQCTQG 1 1 SACCGLAMLI VLMSVYYRFGGVI ASGAVLLNLLL IWAALQYLDA 
P LTLSGLAG I VLAMGMAVDANVLVF ER I R EEFLLSQS LK KS VEKGYTKAFGA I FDSNLTT 
^/LASALLFFLDTGPIKGFALTLILGIFSSMFTALFMTKFFFMLWMNKTQHTQLHMMNKFV 
G I KHDFLRGCKKLWAVSGS VFLLGCVALG FGAWNSVLGMDFKGGYAFTFNPKEHG I SDVA 
OMRGKVVHKLQE^GLSSRDFRIOTFGSSEKIKIYFSDKALSYTI<ADTSLSPKINDHELAL 
AVGLLS ETGLDFSTETLNETQNFWSKVSSKLSKKMRYQAT IGLLGALAI ILLYVSLRFEW 
OYAFSAVCALIHDLLATCAVLFIAHFFLKKIQIDLQAIGAUfTVLGYSLNNTLIIFDRIR 
EDROANLFTPMHVLVNDALQKTFSRTVMTTATTLSVLLMLLFIGGSSVFNFAFIMTIGIL 
LGTLSSLYIAPPLLLFMVRKENRSK 

CPn_0565 655741 654533 

CT44y hypothetical protein 

WKLFCFLIFCFVNISAILFDSSFLLKIKRNSKRMLRSMKFPRISISDLIPTQMVIWWRGG 
GNVH YVPNAQNLPKK I LGGVLACFGLALLGCAAFAAGVCQT I F PC IGLM I LGLVLLGFAY 
LQYSKGW5RFERPLFRETKVFEKPINWLGCLSLLQSWKKIRPGCYYHPGCPQVEICEGSQ 
EIVTK r FQKKSDRNTS I FL IQEMDQI ALRQGIEKSSLSRKTFAIDPSWSSLLSEIQREE 
vOY LD PKV I SWSSEDQASDRTH PKSA I YVN I G DAAQEPQGRCY I DAYTKAFFTVLDQ IGD 
PN rVKKHTT YVLTPI LGVPDALPKEE0ENLKLL5QAAFLYSAEQVAKRMREEKQDS IRIK 
FIFTDPTSPTrJLYFiJPHHSSTPHSVTPISLSGFVGEQESYTFA 

<:pn_oSM, bSt>0'»'> hSftfjou 

/■>< t.jm t iy 

KH [VfYALlNlJPVDI .SI ■ATNNAELSKFP^UiPLPNHVA I IMDGNPRWYKKHREECCHTHTS 
' ;J I YV ;AKV[ .PH r LNAVt ,t)L,C tKVLTLYTrOTENFGP PKEE IQETFN I FYTQLDKQLPYLM 
\\til'. IL'L1« ' I { 11 IKLPKt ; KJ'VK INMV:;RMTA.iP;7PLELVLiAVNYOf";KDCLVRAFKKLHVD 

r lnkk f :::JrjlH.:;F^:ta:::;YLPP::fXTDPDLL[RTry;EMRv:;NrL[ J WQ[AYTELVITDTLW 
I - lil-T i -OI )[ .FF.A [ NVYOOR: !RRt * 

':iii_n',(,7 t.ShU'M f.S7>u/ 

cdsA-Phosphatidate Cytidylyltransferase 



VI^SWFKSKT^AYGDLFC'P'/VAA{SUa>TFLVLLLY3^LFPLT';F f \LOFITATCGAVGTY 
EYSSMAKAKMH*^PLSTFSAIGSFLFLALSFL.G [RWHSLPOFFDALPVvTLLIWVVWSIF 
RVRKST IGALGL3GVTLF3ILYVG I P I RLFLHVLY5 F I HTQEPYLG IWWAGFL I ATTKGA 
DI FGYFFCKAFGNKK I APQ 1 3PNKTWGFVACCLCATL ISFt FFLO I FTRFASYFPMPAI 
LI PLGLALGITGFFGDI IESIFKROAHLRNSNKLKAVOGMLDTLOSLLLSTP IAYLFLLI 
TQSKEFIG 

CPn_0568 657805 6534b4 

: -\ >■! ; j- i i-L' , ' . !v 1 , ' ir '■ • . 

k'.v.vm' ' "rr- . rr:> .vr~\s ri,t ir *i ""r.v >" M : \ /"Ai'Lo nr^TKru 
EEPPFSFTFATOQPLESFFriGHIiTSEiTTQEVANAASEl^OLPEVRAFMODLORRYAQL 
GNCVFEGRDMGSKVFPNADLK I FLTSSPEVRAQRRLKDLPEGTLSPEQLQAELVKRDAAD 
AQRAHDPLVI PENGIVIDSSDLT I ROVLEKILALLFRNEL 

CPn_0569 658398 659099 

plsC-Glycerol-3-P Acyltransferase 

LFGFDNKTSSGENFSFTISKRAMIFR ICKFFTWAFSLFYKLKVYGVKKNFIKGPAI IAV 

NHNSFLD P I ALHMCVH EC I YHLARAS L FN I PWLWKQWGC FPVRQDEGNS AAFK I AS RLFN 

KRKKuVIYPEGARSPDGQL^PGKVGIGMMAAXSRVPIIPVTIRGTFEAFN^ 

T I TCVFGTPMYFDD 1 1 QNP E I KNKETYQ I ITNQTMNKI AELKAWYESGC KGDVP 

CPn_0570 659044 660789 

argS-Arginyl tRNA Transferase 

TKLPSSKHGMMRGAKETS PKLMSTLLS ILSVICSQAI AKAFPNLEDWAPEITPSTKEHFG 
HYQCITOAMKLARVLKKAPRAIAEAIVAELPQEPFSLI EI AGAGF INFTFSPVFLNQQLEH 
FKDALKLGFQVSQPKKII IDFSSPNIAKDMHVGHLRSTI IGDSLAR I FSYVGHDVLRLNH 
IGtWTAFGMLITYLQENPCDYSDLEDLTSLYKKAYVCFTNDEEFKKRSOTNWAI^ 
PQAIAIWEKICETSEKAFQKIYDILDIVVEJCRGESFYNPFLPEIIEDLEKKGLIjTVSNDA 
KCVFHEAFS I PFMVQ KSDGGYNYATT DLAAMRYR I EEDHADK 1 1 IVTDLGCSLHFQLLEA 
TAIAAGYLQPGIFSHVGFGLVLDPQGKKLKTRSGE 

LTDEAIQERAPVIGINArKYSDLSSHRTSDYVFSFEKMLRFEGNTAMFLLYAYVRIQGrK 
RRI^ISQLSLEGPPEIQEPAFFTiT^ALTIXJ^PEALESTIKELCPHFLTDYLYNLTHKFNG 
FFRDSH IQDS PYAKSRLFLC AIAEQVIATXjMHLIiGLKTLERL 

CPn_0571 662179 660749 

murA-UDP-N-Acetylglucosamine Transferase 

TFEKVNVS FSDFDAKG ERRMQ I AQWGCGRLl^EVKVSGAKNAATKLLVAS LLSDQKCTL 
RNVPDIGDVSLTVE1CKSLGAHVSWDKETEVLEIYTPEIQCTRVPPTFSNW 
ALLCRCPSjVYVPTVGGDAIGERTI^FHFEGLKQLGVQISSDSSGY^ 
LPYPSVGATE^IL^AAIHAKGRWIKWALEAEILDLVLFI^KAGADITTDNDRTIDIFG 
TGGLG SVDHT IL PDK I EAAS FGMAAWSGGRVFVRNAKQ ELL I PFLKMLRS IGGGFLVS E 
SGIEFFQERPLVGGWLETDVHPGFLTWQQPFAVLLSQAQGSSVIHETVHENRI,GYLHG 
LQHMGAECQLFHQCLS TKACRYA IGNFPHSAV I HGAT PLWASHLVI PDLRAGFAYVMAAL 
IAEGGGSI I ENTHLLDRGYTNWVGKLRSLGAK IQ I FDMEQEELTTS PKSLALRDASL 

CPn_0572 662349 664616 

CT456 hypothetical protein 

I MAAP I NQ P S TTTQ ITQTGCl'rri"!"!' VG S LGEH SVTTTG SG AAAQT S QTVTL I ADH EMQ E 
IASQDGSAVSFSAEHSFSTLPPETGSVGATAQSAQSAGLFSLSGRTQRRDSEISSSSDGS 
SI SRTSSNASSGETSRAESSPDLGDU3SLSGSERAEGAEGPEGPGGLPESTI PHYDPTDK 
ASIliNFXJCNPAV&KMQTKGGHFVYVDEL*^^ 

? ADLEMC I AKFCVGYET I HS CWTG RVK PTME ER SG ATGNYNH LMLS MKF KT A WYG PWNA 
KESSSGYTPSAWRRGAKVTTGPIWDDVGGLKGINWKTTPAPDFSFr^^PGGGA:HSTSHT 

GPGTPVGATWPNVNVNLGGIKVDI^INI^ITT^ 

IT STGSQST I EEDT IQFDDPGOGEDDNAI PGTNTPPPPG P PPNLSS SRLLTI SNASLNQV 
LlQ^A^QHIuNTAYDS^KNSVSDLlNQDLJGQVVKNSENGVNF PTV I L PKTTGCTDPSGQATGG 
VTEGGGHIRNI IQRNTQSTGQSEGAT PTPQPT I AKIVTSLRKANVSSSSVLPOPQVATTI 
TPQARTASTSTTSIGTGTESTSTTSTGTGTGSVSTQSTGVGTPTTTTRSTGTSATTTTSS 
ASTQTPQAPLPSGTRHVATISLVRWAAGRSIVLQCGGRSQSFPIPPSGTGTQNMGAOLWA 
AASQVASTLGQWNQAATAGSQPSSRRSSPTSPRRK 

CPn_0573 665413 664691 

yebC family 

VEDMAGHSKWAKTKHRKERADHKKGKIFSRIIKELISAVKLGGADPKSNARLRMVIQKAK 
ENN I PNENI ERNLKKATSAEQKNFEEVTYELYGHGGVGI IVEAMTDNKNRTASDMR IAIN 
KRGGSLVEPGSVLYNFARKGACTVAKSS IDEEVIFSYAI EAGAEDLDT EDEENF LV ICA P 
S ELASVKEKL I SQGATCSEDRL I YL PLRLVDC DEKDGEANLAL I DWLEQ I EDVDDVYHNM 
S 

CPn_0574 665978 665394 

No robust homolog present in Genebank/EMBL as of 11/7/98 
SAERGFRHPIVMVETVLHNFQRYLSKYLYRVFRFPCRKKTFLSSHRVLARPSFPVDYCPG 
K I YDLQ E I Y EELNAQL FQGALRLQ IG WFGRKAT RKGK S WLGLFH EN EQ L I R I HR S LDRQ 
EIPRFFMEYLWHEMVHSWPREYSLSGRSIFHGKKFKEYEQRFPLYDRAVAWEKANAYL 

LRG Y K KRVGGG YG RA 

CPn_0575 666524 665982 

YhhY-Amino Group Acetyl Transferase 

S I FGRVWRS FMTAEKQNTG I LGLEI RYTL PS DATYMLKWLNDPK I LRGF P I QTEAE IRET 
VNFWVG FY R Y H S S LTA\A'NGNVAGV ATLVLN P YVKVS HH AL I S 1 1 VG E E FRN KG IGT ALL 
NNLIHLAKTRFKLEVLYLEVYEGNPALHLYQRFGFVEVGRQNRFYKDEIGYLAKTTMEKD 
L 

CPn_0576 667543 666494 

prfB-Peptide Chain Release Factor 2 (natural UGA frame-shift) 
MOENLDKRLEALRTEISLAARSL 

CPn_0576.1 no75 f -ifj 

prfB- (natural UGA frame-shift) 

MOENI.DKRLC.M.RTE ISLAAR::L 

CPn_0577 

C'lE'M^QKMKNSArMHPVN I^TDI^AV [ V^jK( IPMPRTE TVKKVWf :Y IKKUNC 'OOQKNKRN I L 
PGANLAKVFG.V'DP [ DHFOMTKAL.'JKI I [ VK 

yaeI-phosphohydrolase 

TTNiJFUVLIS r:"LiATLP [ LAI' -V/A.S! r FVtM ,PTTA I rWPt.PKKMAHLI K ILK [ AO L :lO\Ai 
Kl IKPVPEKFLNKV: :K.1 IKNF' jF'L-J , rVF r r\\ i\ A .CRARLELKERLETFLHTl.EAPU IVFA [ I , 
GMHUY. l J:;Y[:lRNTKGf:[TCT['FFK::iM'rfj|vA I [AVMVILF.SSP^YRYDPNLTFVFPMPDI. 
LKLr.KNTPLTLLHNTTHVrPOTLNIVGLGDLFAROFHPEQAFKNYDPSLPGLLLSHNPDG 
tTRLWYPCDFVLSGHSHGPQVTLSWPKFARKFFERLSGLENPYLARCT^P/rKEGKQLYV 
NRGLGGLKR [RFCSPPEICYITCGYD 

CPn_0579 669310 669993 

ygbP/yacM-Sugar Nucleotide Phosphorylase 

KEFAoAPLLKGAT^HVPMIKSSLILLSGGQGTRFGSKIPKQYLPLNGTPL'/LHSLKILSS 
l.PO rAEVTWCDPTtXJETFQEYPVSFAIPCERRQDSVFSGLOQVT^PWVI IHDGARPFIY 

• : ; ,;,ktahi' i>.,. vrn r**: fopn^vt .rrJpNLAi p ' : . :^z:i,?FnLA 
[.Ai'-iuvji : ^a: i lote ;vLyiwHivi • / : ' i'pldlt e aoa— 

CPn_0580 669936 670793 

truA-Pseudouridylate Synthase I 

ASSNQNFLPRRSNDCPSPPMTKVALL I AYQGTAYSGWQQQPNDLS IQEVI ESSLKKITKT 
RT P L I ASGRTDAGVHAYGQVAH FRAP DH PLFANANLTKKALN A I LPKDI VI RDVALFDDN 
FHARYLAIAKEYRYSLSRI^PLPWQRHFCYTPRHPFSTELMQEGANLLIGTHDFASFAN 
HGRDYNSTVRTIYTt^rVDKGDSLSIICRGNGFLYKMVRNLVGALU3VGKGAYPPEHLLD 
I L EQKNRREG P S AAPAYGLS LHHVCYS S PYNNFCC EQCSVSTSNEG 

CPn_0581 671533 670745 

Phosphoglycolate Phosphatase 

EXSLRWRSVKSFLRQCWIYSMLVSDEFQLCI^SGMYLEDYDVFFFDLDGLLVDTEPCFYRA 
FLQAC AEFS LEVHWDFSTYYSHTTLGTE I FSKKF I EQYPQAQ EYMAE I FAKR LQ I YYKSL 
EHAGPALMPGVEAF I ELVL S LNKT FGWTNS P RDAT HTLRTMYP I LNKF L FWVTRENY AR 
PKPYGDSYDYAYRTFAREGMKVIGFEDSVKGLRALSKI PATLVC INSMAE IT PEDYPELK 
GKEFFSYPSFDVLTEHCSQQKLL 

CPn_0582 671305 672177 

CT465 hypothetical protein 

KNPNALLKK I QH RLVKMHDKNKVLYLQANHLNQKRKRHN PLNTYHSSNTTET RRLPTYYK 
SNIVLKMILRISTVSLLTSCSFSKNSRTCFVTPERITSQKDCPVLLHPKSTTISPPLYEW 
I SPNREVITAYSFYCRGQGNS I ITPEGVLYDC DGLHHSITKEEFRY IHPRLI EWRLLQQ 
DH PKVS 1 1 EAFCC PKH FH F LEASG I SLSQLHLQGTAATFALDPPLPMEKLLAT I KKLYKK 
NSDPSLSNFIVTEIATLTNPEIJ^TQQDIXSSHTEITVEII^NI^NKEALSSA 

CPn_0583 672349 672717 

CT466 hypothetical protein 
IVLSFFLGKTKVTPRFLMNERTLLLLLKKKKGLFLA 

IFLSCIDRVDLQIKEFRHAFSSELPQDIQEELEEIRDVI IRILDTDKRNYAQKKKEFGIY 
ERP 

CPn_0584 672659 673798 

atoS/ntrB-2-Component Sensor 

IRTOTMHRKKRNLVFMNVPDSKNLHPPAYELLEI KARITQSYKEASAILTAI PDG ILLL 
SE^HFLICNSQAJ^IIXJIDENLEII^^FTDVLPI^I^FSrQEAI^UCVPKTLRLSL 
CKESKEKEVELFIRKNEISGYLFIQIPJSRSDYKQLENAIERYKNIAEI^KMTATLAHErR 
NP^SGIVGFAS I LKKEISS PRHQRMLSSI ISGTRSLNNLVSSMLEYTKSQPLNLKI INLQ 
DEFSSL I PLLSVSFPNCKFVREGAQPLFRS IDPDRMNSVVWNLVKNAVETGNS P ITLTLH 
T^G^ISVTNPGTIPSEIMDKLFTPFFTTKP^NGl^LAEAQKIIRI^GGDIQLKTSDSAV 
SEFIIIPELLAALPKERAAS 

CPn_0585 675880 673865 

*similarity to Cps IncA_2 

igis^RKII^PNNFSIGDCSSNMATPAQKSPTFQDPSFVREIX3SNHPVFSPLTLEERGEMA 
I ARVQQCGWNHT I VKVSLI ILALLTILGGGLLVGLLPAVPMF IGTGLIALGAVIFALALI 
LGLYDS QGLPEELPPVPE PQQ I Q I EDLRNETREVLEGTLLEVLLKDRDAKDPAVPQVWD 
CEKRIjGMLDI^RREEEILYRSTAHLKDEERYEFLLELLEI^ 
QC*IJ&TVRSEEGEKEISRIX2DLISL^ 

H DREAS QRAC EGT EMDC AE RGQLEKD LRRQLKSMQEWI EMRGT I HQQ EKAWRKQNAKLE R 
LQE13LRLTG I AFDEQSLFYREYKEKYLSQKLDMQKI LQEVNAEKSEKACLESLVHDYEKQ 
LE^KDANLKKAAAVWEEELGKOQQEDYEQTQE I RRLSTF ILEYQDSLREAEKVEKDFQEL 
Q^R^SRLQEEKQVTCEKILEESMSIHFADLFEKAQ 

WVLTD S ASL S QKK I RE LVE EKQ ELLKALAFKS NELTQLVADA VEAE KE I SKLREH I EEQK 
EGLKALDKMHAQAI KDCEAAQRKCCDLESLLS PVREDAGMRFELEVELQRLQEENAQLRA 
EV|^LEQEQFQG 

CPn_0586 675993 677183 

atoC/ntrC-2-Component Regulator 

KEKMNPS RG ENMAI KN I LWDDEPLL RDF LS ELLTSQG F I PDT AENLRNALQMI RSRDYD 
LV ISDMSMPDGSGLDL I KI IKQSS PHTPVLWTAYGS I ENAVEAMHQGAFNYLTKPFSS E 
AL F AF I S KAEELKNLVH EN LF LHSQTT PDSH P L I AESKAMKD LLA I AKKAAS S SAN I F I H 
GESGCGKEVLSFF IHHNS PRANH PY I KVNCAA I PETLLESELFGHEKGAFTGATTKKAGR 
FELAHKGTLLLDE ITEVPVNLQAKLLRAIQEKE I EHLGGTKTLSVDVRI LATSNRKLKEA 
IDDKSFRQDLYYRIJWIPLHLPPLRDRQDDILPI^^FL^FCRMNNTPLKTLSPKAQEL 
LLNY PWPGN I RELSNVL ERW ILENTS LLTE DMLALA 

CPn_0587 677378 678124 

*yvyD_Bs conserved hypothetical protein 

SYGELF I LSTLLKHHVTLGDKMRPHRKHVSSKSLALKQSASTHVEITTKAFRLSMPLKQL 
ILEKSDHLPPMETIRWLTSHKDKLGTEVHWASHGKEILQTKVHNANPYTAVINAFKKI 
RTMANKHSNKRKDRTKHDLGLAAKEERIAIOEEQEDRLSNEWLPVEGLDAWDSLKTLGYV 
PASAKKKISKKKMSIRMLSQDEAIRQLESAAENFLIFLNEQEHKIOCIYKKHDGNYVLIE 
PSL.KPGFCI 

CPn_0588 678033 678626 

CT469 hypothetical protein 

TSKSIKSNAFIKNMTATMSLLNLPSSQDSAGEDSTSQSQIFDPIRNRELVSTPEEKVROR 
LLLJFLMHKLNYPKKLI I IEKELKTLFPLLMPKGTLI PKRRPDILI ITPPTYTDAQGNTHN 
U3DPKPLLL r ECKALAVNQNALKQLLGYNY3 IGATC IAMAGKHSQVSALFNPKTQTLDFY 
rGLPEYSQLLNYF ISLNL 

CPn_0589 678634 679395 

CT470 hypothetical protein 

::SMyi^'VT(lW[ J R:JRPLLJKNlITLTPLFTPEGLrTFFAKQ(;0'rLOCDYRETLVr-r^LGKYT 
I .HKN'';r;RI J PKliTH(JD [ LNAFEAIKOTYALLEA130KM £QALLJ\:X2WKEKrSHKLFSLFLNr 
IJIIUr j E:;: J *NrEFF/\A[FVLKLLQYErjILDLTPACCLCKA^[ J PYACYHY^\^!KLCKKHOHK 

oa [ [ ekeeeu i lqa [ ehakqfsella iaefp f a iaek i tylfd: 'loeekk^crn^fjedp 
yiieii.pi.::kwhpy 

'■[•n_0 r /tl) i.HUnn b7 [ ) t J l'i 

CT471 hypothetical protein 



L F L YCDHNLG FACP Y L F F F I VL F AGG 3 FGNC L V F< "WL : ' E E C J FYT ; iPfCF~K5Y PDME 
NME IQAORKKRVEFtlLTGEFPKLETLNYO^S FOHLRAKCRGVY PVLYALNF^CSSCKMDM 
DFRGKWNRS3TITI3NQKES INLKLPKOVGV I VNTKT'ILKCNVC PGSTF [KO/TWGVWNKI 
YUNDLVCFSEVTLIFNVSSEOGTITFS 

CPn_0591 680364 681020 

yagE family 

rMRCTAYCTASAYNLHVLFHLLKPRYPT I I^REYVLANLDGTQASNQLA I FFPFGVAV 

\.':\*wi;rrr.~?:rs"' "* ' ; n "*"*' i*i v\-:'i j f "L:'LrFA..'VNL, 

H S D I LD E P DF FWDH p ETQA I Y RDV L j'C LD I EAR I NV L I V 

CPn_0592 681132 681461 

yidD family 

LYSKMFSMS FKR FLQO I PVR I C LL 1 1 YLYQWL. I S PLLGSCCRF FPSCSHYAEQALKSHG F 
LMCO^SIKRIGKCGPWHPGGIDMVPCTAL^EVLEPYQEIDGGDSSHFSE 

CPn_0593 682494 681391 

CT474 hypothetical protein 
VLGAKCMAFXFJCTRWLWQVLILSVGLNMLFL^ 

VYLSEDFLNEISQASLDDLISLFKDERYMYGRPIKLWALSVAIASHHIDITPVLSKPLTY 
TELKGSSVRWLLPNIDLXDFPVILDYLRCHKYPYTSKGLFIilEKMVQEGWVDEDCLYHF 
CSTPEFLYLRTLLVGADVQASSVASLARMVIRCGSERFFHFCNEESRTSMISATQRQKVL 
KSYLDC EES LAAI.LLLVHDSDVVLHEFC DEDLEKVI RLM PQES PY SQNF FS RLQHS PRR E 
LACMSTQRVEAPRVQEDQDEEYWODGDSLWLIAKRFGI PMDKI IQKNGLNHHRLFPGKV 
LKLPAKQS 

CPn_0594 682517 684958 

pheT-phenylalanyl tRNA Synthetase Beta 

OTCHYT0VIVKSLVKTSLRLSSMRIPITLLOTYFSEPLSTKE ILEACDH IG I EAE I ENTT 
LY S FAS VITAKI LHT I PHPNADKLRVATLTDG EKEHQWCGA PNC EAGL IVALALPGAKL 
FDSEGOAYTIKKSKIilGVESCCMCCGADEljGLJ}EIX?IQERALLE^ 

NTSI^ISLTPNI^HCASFlXrAREICHVTQANLVrPKEFSFENLPTTALDMGNDPDICPF 
FSYWITGISAQPSPIKI^F^I^AI^QKPINAIVDITNYIMLSLGQPLHAYDASHVAIX»S 
IJIVEKLSTPESLTLLNGETVLLPSGVPVVRDDHSLLGLGGVMG 

AYFLPEALRASQKLLPI PSESAYRFTRG IDPQNWPALQAAI HYILEIFPEAT ISP IYSS 
GE ICRELKEVALRPKTLQRILGKSFS IEILSQKLQSLGFSTTPQETSLLVKVPSYRHDIN 
EEIDLVEEICRTESWNIETQNPVSCYTPrYKLKRETAGFLANAGI^EFFTPDLliJPETVA 
LTRKEKEEISLQGSKHTTVLRSSLiPGIXKSAATNLNRQAPSVQAFEIGTVT 
ETC/riAILLTEDGESRSWLPKPSLSFYSLKGWVERLLYHHHLS I DALTLESSALCEFHPY 
QQGVLR I HKQ S F ATLGQVH P ELAKKAQ I KH PVF F AELNL DLLC KML KKTTKL YKPYA I Y P 
SSFRDLTLTVPEDIPANLLRQKLLHEGSKWLESVTI IS IYQDKSLETRNKNVSLRLVFQD 
YERTLSNQDI EEEYCRLVALLNELLTDTKGT INS 

CPn_0595 684943 685926 

CT476 hypothetical protein 

RD YQ FMKQL LFCVCVF AMS C SAY AS PRRQ D P SVMK ETFRNNYG 1 1 VSGQ EWVKRG SDGT I 
TKVLKNGAT LHEVYSGG LLHG E I TLT F PHTT ALDW Q I YDCG RLVS RKT FFVNG L PSQE E 
LFNEDGT FVL.TRWPDNNDS DT ITKPYF I ETTYQGHV I EGSYTS FNGKYS S S I HNGEGVRS 
VFSSNN I IJjS EETFNEGVMVKYTTFYPflRDPES ITHYQNGQPHGLilLTYl^^ I PNT I EE 
WRYGFQDGTTrVFKircCKTSEIAYVKGVKEGLEI^YNEQEIVAEEVSWRNDFLHGERKIY 
AGGIQKKEWYYRGRSVSKAKFERLNAAG 

CPn_0596 685930 686457 

ada-methyltransferase 

FAVMADDTL I PKLMKNSLSQACSEGLL I AKYP PLQV I VH FDNNLVVKTHLSVAPVFSCLF 
LGPAAHKAMQEIVLWCSRYANKEHPPFSSHFAKDL I PSQYLE ILNCVAE I PFGEQQTYAE 
IAKKTDTHPRWGAACKQNPFLLFFPCHRWGSHGERNYVLGPVIHEILLKFENSY 

CPn_0597 688215 686479 

oppC-Oligopeptide Permease 

MQKHPSFYORFLSAYYKNLIASLSWKFFISVALIGIYAPLFASSKPLLVTWHGEIFFPLL 
RYLFFPGYYTKPVDLFFNVLMVTFPFF ILSFKLTRGWLRRWLLGLX 1 1 SOCM I FAWAYSG 
fC/QDPALAENLKKMRA EKVREN I SKVNS EMVMLL PK DT RTWE MERR YMSTY EQ LG I L I KA 
KYRKKQ EAS VKKYQVAFEEKRQS PM PTLR HLEMKNEG IC LKRLQQRVDKMQR PYEMAQQA 
WNRATDNYRPFLMALTRIEHELRIADYNNW^PEDLCIAYANVEKRAEPYKKSLLEIRQV 
LEDYAKLRSAISFIQDKRLWIEKESEDLRILINPFFSSFHWEDDAGGSREMNKYVPWWQL 
SRVTRKDLLAALVFG I R I ALWAG I G I T I ALA IG IM IGLVSGYFGGTVDM ILSRFTEIWE 
TMPVLF I LMLVI S ITQQKSLLLNTVLLGC FSWTGFSRYVR IEVLKQRDRGYVLAATNLGY 
SHYYIKVHQILPNAIVPVISLVPFAMMAMISCEAGLTFLGLGEESSASWGNLMREGVTGF 
PAES AVLWPPAI I LTMLL I A I AL IGDGVRDALDPRLQDS 

CPn_0598 689712 688219 

oppB-Oligopeptide Permease 

EEGGSVLKYILKRLVLIPLTLFAIVSINFVILNAAPGDVLEEKSRDALGEAGKSDKMRSY 
KGPDRYI^FREHYGLTLPIFFNTRPKITHKKIQTALQELANAWNTTPSAKNAAKSL\/YWG 
EXTAKFVMPALLF EADDASRDDKYRH I AADLF I RGGVLQG FVGPNLS PEQRAQNKEI AESN 
AFLVROLNEEDLLTTKVEALKGWFQDHGGTEVFCYSSKOFWKTFFLETRFARYMSRVLRLD 
FGTL RN DAH KTV ISEVIKRLRCS LVLS I L PM I VGFV LCQ I FGM I MALKRN RW I DHS LNF I 
FLILFSIPVFVAV^ ILDNFVIMKTI PFTTIPMPYSGLRSPPEVFNELSTLGRIFDLVSH 
GFLPFCAVSYGALAAQSRLSRS I FLEVLSQDFICAAKARGLRWFDI LYKHVGKNAAVS IV 
TSLASSLGTLI^ALWETLFNITCFGNFFYQAILNRDHNVVLFSVLVGSALSLVGYLLG 
DICYVLLDPRVQLEGRRI 

CPn_0599 691323 689682 

oppA-oligopeptide Binding Lipoprotein 

KRRECGHMYKRCVLPKILKOrVACSLILLYWSGDLLERDIKSIKGIWRDrQEDIREISRV 
VKOO f jT.^QAEPAAr\lV'ML/\PKLVRDEAFALLFGDPSYPNr J LSLDPYKQOTLPELLGTNFH 
PMr;iLRTAHVGKrCKL.3PFNCFDYWGF"f DLC I PHLA:JPHVGKYEEF^ PDLAVKIEEHLV 
ELX;rJO0KEFHIYLRFMVFWRPILPKj\LPKHViJLDEVF0RPHPVTAI!DrKFFYDAVMNPYV 
ATMPAVALRSOYEPVV^VSVWILLKLWr'WKAfnVTNEF/lKEERKVI.YSAFSNTLSLQPL 
PR FV^OY TANCEK 1 1 EDEN IDT tPTNX I VIAQNFTMHWANNY t VS( "(jAYY [■' M IMDDEK tVF 
'JKNl'Of YDPU\ALiIDKRF\ATK'E'JTn:"IJ'OnFKT'';K [D I 'jYLPPNQRDNrY'-.FMKiJ^AYN 

KovAKf •/ ;avr rrrv:- \drayty r ' J >MT:;i,rn.) , :i"jvRi -AMriMA i dk fr l i iucluwzyt 
i ■ :( ; i 'FA'::;:' p:;v nkc r ecwi i y : r-ur.AAi' l i aa :i :< ,w t L7ii x ;rx ; [ a kk v i n« ;v t vpi-'RFRLC 

YYVK.T/TAHT [ Ar>\"\' YrAL'KE Vj I IT\IA/\U >MAI;L::ijAI'TjKKNKDAL[Mi mt/itpped 
PRAUWIinEXlAMr-IJv^'.XNWiIFMIIEFADK f IDHI, ;YKYUf ,KKPNFH .YHRKMK I [MEF.APYA 
M.t SPHC.^LLYKtnVKN I rVFTMI'TElL r l-f WU ,TVNVr'MVWLI-:KKEDlH'U;'r:; 

«-i-n_'j^)o *.".m',^ .'.in.:/ L^fcylr 
HG Y MK I KK 3 FQ Y 3 LCQAKR FQMM L PNH FD PC LQ PVNLOL KQ D R LAYGEL 1 1 L LS KYQQ KT 
FSSLLKEETTCrjLNRAKQHLLYKILRDFrn'MQHLRSLGLNCWGEIPMSPCL 

CPn_0601 693092 692736 

CT483 hypothetical protein 

OFPRIHADDt INSMDErTPNYPLLRQDSLWNR'/RVSWRADLSVSSRYEIASAIAILGLLV 
AFCASAAVS I r FTANP LAQVF I DGCLALG LLPIPLV XGLL I IG I IVLLYG I YLFPQQRE 

DSGFMKPU5FQENLEALCNKTSRQLLKYLIKQILFVCGASLLIALEFSFFLYFFLFSGKT 
VIPAFCLACFFLTLFVCLVTRLYLLSGKGDFFEDLASEYLQGAVPPNKRSQNIVEEQSHL 
AAAATKLSINLQNQEYSLLS EI FKFLPKHDLI RKFSCFC FWKDYFLFRECLLQKAI EAYI 
KWQAI PVDLSAHVSLADAYVALSGLYADPRKYPEFDANYWI PSGRYSAE IQEKFFATAR 
RA I EEFQ I LNEYAPGNAWVHAQLAYS YHDLQM PMEEIQEYE I VLKLKPNDVETMSKLG I L 
Y FQOGMNAKGLR I YEE IKKRDYKK SQfCL I KFYGVEYKY 

CPn_0603 694136 695185 

hemZ-Ferrochelatase 

WKIMRLIVLMGCLVSLFLiAKKVTVTTPAYIJjANFGGPRHAKDLQ 

LPRVLHRHLFTFIAKKRVPKVLPQYQSLQNWS PIYFDTETLAKTLSE I LRAPVI PFHRYL 
PSTHEKTLLALRTLHTRHVIG I PLFPHFTYSVTGS IVRFFMKHVPE I PISWI PQFGSDSK 
FVSLITCHIRDFLQKLGILEKECCFLFSVHGLPVRYISQGDPYSKQCYESFSAITTNFKQ 
SENFLCFQSKFGPGKWLS PSTAQLCQNIDTDKPNVI WPFGF I SDHLETLYE I ERDYLPL 
LRSRGYRALRI PAI YSSPLWVSTLVD I VKENSTWAEELI KSGKKHTG I R 

CPn_0604 695981 695196 

fliY-Glutamine Binding Protein 

CKKRQNSEAQLNVKIKFSWKVNFLICLLAVGL I FFGCSRVKREVLVGRDATWFPKQFG IY 
TSimiAFLNDLVSEINYKENUIINIWQr^^ 

FSDPILLTG PVLWAQDS PYQS IEDLKGRL IGVYKF DS SVLVAQNI PDAVI S LYQHVP I A 
LEALT SNCYDALLAPVI EVTAL I ETAYKGRLK 1 1 SK PI^ADGLRLAI LKGTNGDIiLEGFN 
AGL VKTRRSGKYDA I KQRYRLP 



GRFFNIK3PKQLSDILYNELGLRP LDKAKJTPACVLCALRjrHr t * L'KLLAKRT IEKLLS 
T YVKAL PKQVDS HTQR IHPSFDQTGAVTORLACRDFNLCN I P r R3ERG I LLRKAFRLSEK 
NSYFLSADYSQIELRFLAHLGQDKGLKFAFE^GCDtHAFTAb-OVFUVPLEQVSKEQRMQA 
KTVNFG IVYGQQAFCLAKVLK I S IGEAQEL 10 AYF3R Y PE I AHFVEET IQQAAKDLRVTT 
MLGRER 1 1 DSWNEFPCSRAASGR FAVNTR I Q03AAEL I KLAMLD I SQA I KQOQMKS RMLL 
QIHDELLFEVPEEEIEEMCRLVREKME3AMTL3VP tWNI L IGKNWAEC 



CPn_0605 696737 696150 

yhhF-Methylase 

LRKLC SSRGDVR I LAGKYKGKSLKTFSNPHIRPTSGLVKEAFFSICRED I EGAAFLDLFA 
GMGAIGFEALSRGAASWFVDI SIKAIQL IHTNSALLGEQLPWI FRQDAQSAIQRLIKQ 
KR S FDLI Y I DPPYELCNCYVETLLQK IVSGN I LNPEGTLFLENASDEE I AC EGLTLRRRR 
KLGKTYLAEYIVEKDP 



CPn_0606 697492 696707 

CT388 hypothetical protein 

S SYS RRQLRFYTGSLQMH I YGLADLH LALGVP EKTMEVFGDPWI GYHQK ICS EWQAWH P 
EOl?^LPGDISWAMNLSEAHKDFAFIGDLPGTKYMIRGNHDYWSSASTSKILOALPPSLY 
Yl^^FALLTPHIAWGVRLWDSPTICVKKENFLTPSTQE^SYTEQDFiCIFLRlIjGRr^ 
AtMLPKEVTEVIVMTHYPPISSDGTPGPISEFLEADGRVSLCLFGHIHKVQRPrDGFGN 
IRQ|HYI LVAADYVNFVPQEVM 

CPn_0607 698910 697573 

glgC-Glucose-1-P Adenyltransferase 

NR;R-IQMIENDFPFASNFESSHFYRDKVGVIILCGGEGKRLSPLTNCRCKPTVSFGGRYKL 
IIjfMlSHAISAGFSKIFVIGQYLTYTLOQHLFKTYF"YHGVLQ 

QGf'fADA I RKNLLYFEDTE I EYF L ILSGDQ LYNMDFRS I VDTA I RTHVDMVLVAQ P I PEKD 
AY^ftMGVLDI DSEGKL I DFYEKPQEKEVLKRFQLS S EDRR I HKLTEDSGDFLGSMG I YLFR 
R0SLFSLLREEEGNDFGKHLIQAQMKRGQVQTLLYNGYWADIGTIESYYEANIALTQKPH 
AEKRGl^CYDDNGMIYSKNHHLPGAIITDSMISSSLIXSGCVINTSHVSRSVLGIRSKIG 
ENSWDQSI IMGNARYGSPSMPSLGIGKDCEIRKAI IDENCC IGNGVKLQNLKGYIKYDS 
PQiqaFVRDN 1 1 I VPQGTH I PDNYIF 

CPn_0608 699690 699016 

Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated? 
VS'FJuYFVKNGRRLWRMMNY EDAKLRGQAVAI LYQ IG AI KFGKH I LASGE ETPLYVDMRLV 
ISSPEVLQTVATLIWRLRPSFNSSLLCGVPYTALTLATS I SLKYNI PMVLRR KELQNVD P 
SDASKVEGLFTPGQTCLVINDMVSSGKS I IETAVALEENGLWREALVFLDRRKEACQPL 
GP^IKVSSVFTVPTLI KAL I AYGKLSSGDLTLANKIS EI LEI ES 

CPn_0609 699672 699986 

CT490 hypothetical protein 

QNTKNSLI RENMLI RLFLG I SLPKGF PLYLEP PLVLATFQGTQFVGTYSEATNPLY I DNL 
NLNYHYTQELLYKAVPCNYKSIYREIPLIIFPEVLIGSTPTQSTE 

CPn_0610 701450 700029 

rho-Transcription Termination Factor 

R I FLRFKGS IMKEERSSEI LPRVKETKKHAWSMQEKSCVGECAWASESEEAESVTVTK 
rAKLQP^GIEELNILARQYGVKNIGSLTKSQVVFEIVKAKSERPDELLIGEG'/LEVLPDG 
FGFLRSPTYNYLPSAEDIYVSPAQIRRFDLKKGDTI IGTIRSPKEKEKYFALLKVDKING 
STPDKAKERVLFENLTPLYPNQRIVMEMGKDHLAERVLDLTAPIGKGQRGLIVAPPRSGK 
TV I LQ S I AH A I AVNN P D I V L I VL L I DER P E EVT DMI RQVRG EWAST FD EQ P ERH I QVAE 
MVIEKARRLVEHGNDWILLDS ITRLARAYNTVOPHSGKILTGGVDASALHKPKRFFGAA 
RN I EGGGS LT I LATAL I DTGS RMDEV I F E E FKGTGNMEL VLDRRLS DRRTY PA I DL I KSG 
TRKEELLYHPSELERVYLFRQAIADLTTIDAMHLLLGRLKKTNSNAEFLLSLKE 

CPn_0611 702133 701420 

yacE-predicted phosphatase/kinase 

rr'nrrdaktseredgisydfirsysceylnwkklgpmlkllkvgitgdlscgkteacqvf 
oelcaywsade i5hsfli phtricrrvidllgsdvwdgafdaqaiaakvfynsvllqg 
leailhpevcriieeqyhqsiqdgnyplfvaevpllyeihyakwfdsvilvmanedirre 
rfmkktgrssedfdqrcsrfuweeklaoadwvenngtkkelhqk i eeyf^alkgal 

CV>rtJ)hl^ /0-lhRH 7iV022 

polA-DNA Polymerase I 

Hi.' I fIT;jLUJ7'yERPRREY'\MKKLFVLDA;*0rirR.\YFALPEMKNH0GQATQAVFGFIRCL 
NKLI KEF3PEYM L3VFDGPNNKQ3RQA lYADYKSNP^KKFED I PPQ IALVKEYCSLIGLA 
Yt .EKE.*:VEADDV IAS I AK K AR E ENY KVY VCT A DK D L LO L VNDH V VAWN P WADUG VVG [ 3 E 
V [ EP.Yr :i PptiN t PDYLALVGD33DN I PGLRVOPKKAAALLKQFC5VEGLLENLDAVKGL 

:;utmu:erqetlkl:*kklalld:^ipipvptcsltfpqhpvdeeklihfyioqgfktlvp 

:;K0TEAATVOVrj \ I KDAK: XTN t LNLVOC^D E ArAVAYTGNHLLCLKLECLALTQGCGVF 
Ft ALEEECITK t Lf ' I I»Kni-TI,REDLT r YJYNLKRDCHALLNA(I [ V I RE [ 3 Y DLALAEHLTN 

< x \( ;k r .* ; fq: : i .wm\ ic rr rrrAi i r fak cwon l p iop lp co ^eqy fg e h/ayl p r i kda r l 

K!:! Tf IPKNIiNH [ L,:^D [ FlMlM.LKVLfSMLR-V .VTL,DVrnL,A 1 1, 1 lALI'ETCLAVLTEEI YDL3 



CPn_0613 



705h^2 



0 1 h 5 » 



KTAP 1 1 AV I EMK DV I A33 KNTAKT WN I « EG r EKAF LKD RV KG I V I DMDC PGGEVF E I DR 
lYSMLRFWERKGFPIYI'/VNGLCASGGYYVSCAATKrYATSSSLIGSIGVRSGPFFNVK 
EGLNRYGVESDLLTAGKDKAPMNPYTPWTSHDREERQATLX)FLYGQFVDIVTQNRPtXTK 
EKLVHTLGAR I FS PEKAKQEGY I DWG AT K EQVLC/D I VA VC K I EDNYRV I G S GGDGWWKR 
VASAAASSPLVTGMIKHDILPLSHDAAYI PPYLAL 

CPn_0614 707435 705783 

adt-ADP/ATP Translocase 

VFIRliKVGKEFMQSSEVKPFSRLJ^YLCPIYKSEFSKFVPLFLIAF 
TLVIVGSDAGAEVI PFLKVWG IVPGAVIVTMVYGWI^SRYPRDTVFYCFMAAFLGFFFLF 
AVIIYPVGDSLHLNSLADKLQEIXPQGLRGFIVMVRYWSYSIYYVMS 
GLANQITTITEAGRFYALIirTCLWLSSICAGEISYWMGKQTF^ 

MLITCSGLIMIWLYRR IHHLT I UTS I PPS RRVLAEEGAATANLKEKKKPKAKARNLFLHL 
IQSRYLLGLAIIVI^YNLVIHLFEVVWKDQVSQIYSSHVEFNGYMSRITTLIG^ 
VLLTGQC IRKWGVJTVGALVTPLVMLVSGLLFFGT I F AAKRD I S I F<KT/XG«TPLALAAWT 
GGMQNVLSRGTKFTFFDQTKEMAF I PLS PEDKNHGKAAI DGWSR IGKSGGSLIYQGLLV 
I FSSVAASLNVI ALVLLI IMWWI AWAY IGKEYYSRAADAVATLKQPKEPSSS IVREAQ 
ESVEQEEMAVL 

CPn_0615 708149 707634 

pgsA-Glycerol-3-P Phosphatidyltransferase 

LAKI MRQFCNLLSLSRLWLALYFCQEKLH I RLLAIVGAMLS DVLDG YLARRYKATS RLGS 
I LDP ITDKVFVFVC ITVLYMEGS LS I AHLF F ICARDLFL 1 1 FVCYL S LVKGWKGYDYGSL 
FWGKI FTWQFI ILLGVTAGGEI PWTGLVPLVALGFLYFLERIMDYKKQFLR 

CPn_0616 708704 710137 

dnaB-Replicative DNA Helicase 

TLTNYES SLLMDKSTG VPLP S P PHS KES EM I VLGCMLTGVHY LNLAANQLYE EDFYYLEH 
K 1 1 FR VLQDAFKQDK P I DVKLAG EEL KRHNQ I TV I GG P S YL I TLAEFAGTAA YL EEYVD I 
IRSKSIIJIKMISTAKEIEKRALEQPKNVAEALDEAQNSFFKISQSTSVSQYTLVADKLRG 
LTTTTDK PY LVQ LQERQEL FLONAQG E*JKS FFTG IPTHFIDLDQLIHGFSP SNLM I LAAR 
PAMGKTAIJULNIAE^CFQNRLPIGIFSLEMTVDQLIHRMICSRSEVtlSKKISIGDLSGH 
DFQRIVSVINEMQEHTLLIDDQPGIJCVSDLLPJ^RARRi^KESY^ 

RATESRCTE ISEISRMLKTLARELNI P ILCLSQLSRKVEDRANHRPMMSDLRESGS IEQD 
SDLVMFLLRREYYDPNDKPGTAELI I AKNRHG S IGS VPLVFEKELARFRNYS AFEC IS 

CPn_0617 710481 712316 

gidA-FAD-dependent oxidoreductase 

LMWTHPIAYDVI\A/GAGHAGCEAAYCSAKMGVSVI^LTSNLDTIAKLSCNPAVGGIGKGH 

IVREIDALGGIMAEVTDQSGIQFRILNOTKGPAVRAPRAQVDKQLYHIHMKRIXEOT 

HIMQATVESL1^KEGVISGVTTKEGWMFSGKT\^SSOTFMRGLIHIGDRNFSG^ 

SSC^LSEDLKKRGFPISRLKTGTPPRIXASSINFSCMEEQPGDLGVGFVHRTEPFQPPLP 

QLSCFITHTMEKTKAI I SANLHRSALYGGC I EGVGPRYC PS IEDKIVKFSDKERHHVFLE 

PEGLHTQEI YANGLSTSMP FDVQYDM I RS VLGLENA I ITR PAYAI EYDY IHGNVIHPTLE 

SKLIEGLFLCGQINGTTGYEEAAAQGLIAGINAVNKVFNRPPFIPSRQESYIGVMLDDLT 

TQIIJ3EPYRMFTGRAEHRLU.RQDNACARLSHYGYEIX3LLSEERYELWKQNQLLEEEKV 

RLQKTFRQYGQSWSLAKALSRPEVSYDMLREAFPNDIRDLGAVLNASLEMEIKYSGYID 

RQKILIQSLEXAESLLIPEDLDYKQITALSLEAQEKLAKFTPRTLGSASRISGIASADIQ 

VLM I ALKKHAHH 

CPn_0618 712300 713010 

lplA-Lipoate-Protein Ligase A 

KNMPTTNCI FLDLRGHS ILHQLQIEEALLRVANQNFCI INSGAKDS I VLG I S RNLNQDVH 
ISRAQADHIPIIRRYSGGGTVFIDS^1JWSWIMNSSEASA0PQELIJ\WTYGIYSPLLPN 
TFS IRENDYVLGHKKIGGNAQY IQRHRWVHHTTFLWDIDLDKLSYYLPI PQQQPTYRNQR 
SHEEFLTTLRPWFPSRDDFLERIKASGSLLFTWEEFLDNELEEILAQPHRXATTVLN 

CPn_0619 713462 713013 

ndk-Nucleoside-2-P Kinase 

RR YVYTMEQT L S 1 1 K P DS VS KAH I G EI L S I FEQSGLR I AAMKMMHL SQTEAEGFYFVHR E 
RPFFQELVDFMVSGPVWLVLEGANAVSRWRELMGATNPAEAASGT IRAKFGES IGVNAV 
HG SDTLENAAVE I AYFFSK IEWNASKPLV 

CPn_0620 714145 713519 

ruvA-Holliday Junction Helicase 

DKKYDYIRGTLTYVHTGAIVIECQGIGYHIAITERWAIECIRALHQDFLVFTHVIFRETE 
HLLYGFHSREERECFRILISFSGIGPKLALAILNALPLKVLCSWRSEDIRALASVSGIG 
KKTA£KLM\ r ELKQKLPDLLPLDSRVET3QTHTTSSCLEEGIQALAALGYSKIAAERMIAE 
AIKDLPEGSSLTDILPIALKKNFSGVNKD 

CPn_0621 714707 714144 

ruvC-Crossover Junction Endonuclease 

L3RLGSSrKDNKFKVF0ESIVS ELI IGVDPGT IVAGYAI IAVEQRYQLRPYSYGAIRLSS 
DMPLPMRYKTLFEQLSGVXDDTQPNAMVLETQFWKNPQ3TMKLAMARGIVLLAAAQRDI 
LIFEYAPNVAKKAWGKGHASKRQVQVNP/3KILNVPEVLHPSNEDIADAFALAICHTHVA 
RSPLCGVR 

CPn_0f,.»2 7LS761 7147*n 

CTOOj hypothetical protein 

RY'J'/PLLIj I LKLHLFSLRSS.^CLra f'[-tYYH.'''<.^'R3MLHLLCRWKDAD IMEWQQ [CMI LSGV 
< ':PM f ;GKr J V:'L,OKLTOD^NIOEHERrHLQYRCOL3ALEEEYRRREEAKnODL£KLCX3ENT 

wLOfJp r .akkloo r ri losn 1 1 nc r kk hllo^vof-cte r r; e^rp lcyf.i t k i koleeqloryvs 
cj m t i a t ' : ; 1 1 : t c edk 3 ■ : aa y \ i : l mp lk k :j l i d loo ek d r y r kt y mi e i a k lr ek lq ro egao 

•V : :[:V^^*inKLT[:VnTD[AL-:KKKA[ALLQLUVi:DOY<-or J i'DLIIKEK(-MAMPrNTKLDHLK 

';Li>;Ki.['i:^RvrjWF.u:::K.:[^'; 

' 'Pn_nM» i ' U!j 1 [ / I M t , t 

' TS'M hyporh* t l. ,i ! Pt .join 

I YUYVY FTP DPVrr.TV t !' :PK;YKL';7l?NTKHP f ;00f'FMVRA [KVI:;L/;N [( 'FFRNCD 
H:;K I'fLVI'AiTDYPVMrvm^l [ tlf ,KAV It .1 IVK I AC)f /PEAL I KLTK'jTPLPV [ DEKPLA 

\y.\-\ v<vv\ ;kki:kk3 \kf r,: ;i- ?< jKKWKCKKKi.nppp! ihkh r afvtoa:;oe [ r.DTVK 
F.E LW E E; 'Q EN E I VEQK K F3 L L P P PAK L 1 3 EV I jQTWD P WT SADLNE3 LQ ALVRESS D L 
[NALL3ADDAIHFPETEEEPT3A3FEE33AMFFPETSSATEEE 

CPn_0624 718018 717011 

gapA-Glyceraldehyde-3-P Dehydrogenase 

AMKWI NGFGR I GR LVLRQ TLKRNS3VEYLA I NDLVPGDALTYLFKFDSTHGRFPEDVRC 
EADHL I VGKRK IQFLS ERNVQNLPWKDLGVDLVIECTGLFTKKEDAEKH IQAGAKRVLIS 
A PCKCD T PTFVMCVNHKT FN PF.K DFV 1 TNASrTTNCLAP I AKVLLDNFGTTEGLMTTVHA 

\ v\\\-i :r:u :\ .:v r i wk;M" ;» L»„r; . i a./p ,a, r^vruxt- rLK f .KL* TY .MArRVP [et. / 

.:VVLLT7RLLI-\;';TVtJDI^f''vAMKv'.. J L:[ , L'LKf ; TDCQVYJ.-'Dr IG3 C r .73 [ FDAL/V j 
lAU^RFFKLVAWYDNETGYATRIVDLLEYVEKNSK 

CPn_0625 718488 718060 

rl17-L17 Ribosomal Protein 

VMQHARKKFRVGRTSSHNRCMIJ^MLKSLIHYERIETTLP 
LAARR IAIGRIJIVRYNKLTSKEARQAKGGDTSVYNVDRLWN^ 
RILKLQNRIGDNAQKCI IEFLAS 

CPn_0626 719670 718495 

rpoA-RNA Polymerase Alpha 

V^PAKKKAQSWI^KEKGMSDNAHNLLYDKFELPEAVKMLPVEGLPIDKHARFIAEPLER 

GMGHTI/jNALRRALLIGLEAPAIXSFA>TIX3VLHEYMAIEGVIEDV^ 

PMQDS S LGRTTQVLKAS I S I DASDLAAANGQKEVTLQDLLQEGDFEAVNPDQVI FTVTQP 

IQLEWLRIAFGRGYTPSERIVLEDKGVYEIVIJDAAFSPVTLVNYFVEOT 

LVLIVETDGRVTPKEALAFSTQILTKKFSIFFJ^MDEKKIVFEEAISIEKENKDDILHKLI 

LGINEIELSVRSTNCLSNANIETIGELVIMPEPRLLQFRNFGKKSLCEIKNKLKEMKLEL 

GMDLTQ FGVG LDNVKEKMKWY AEK I RAKNTKG 

CPn_0627 720059 719640 

rs11-S11 Ribosomal Protein 

FL I RSRVLVKNQAQAKKSVKRKQLKN I PSGWHVKATFNNT I VS ITDPAGNV I SWASAGK 
VG YSGS RK S S AF AATVAAQ DAAKT AMNSG LK EVEVC LKGTG AGRES AVRAL I S AGL WSV 
IRDETPVPHNGCR PRKRRRV 

CPn_0628 720461 720063 

rs13-S13 Ribosomal Protein 

DAYTILREAQRMPRIIGIDIPAKKKLKISLTYIYGIGSARSDEIIKKIJCLDPEARASELT 

EEEVGRIjNSIXQSEYTVEGDLRRRVQSDIKRLIAIHSYRGQR^ 

RKGKRKTVAGKKK 

CPn_0629 721881 720487 

secY-Translocase 

K iplFRPYMTTLRQFFLITELRQKLFYTFALLTACRVGVF I PVPG INGELAVAYFKQLLG 
SG#LFQLADI F SGG AF AQMTV I ALG WPY I S AS I IVQLFLVFMPALQREMRESSDQGKR 
RI-sSaJ'RLFWAIJWTQSLLFAXFALR 

TTGTLLLMWIGEQISDKGIGNG I SLI IALGILSSFPSVLGSIVNKLNLGSQDSSDLGLIS 
I S &EALVFVFVL ITT I L 1 1 EGVRKI PVQYARRVIGRREVPGGGSYLPLKVNYAGV I PVI F 
ASSfcLMFPATTGQFI ASESSWMKRIAALLAPGSLVYS ICYVLLI IFFTYFWTATQFHPEQ 
iAS^lMKXNNAFIPGIRQGKPT^ 

S^ELGG TAMLI WGVVLDTMKQ VDAFIiMRRYDSVLKTDRT KG RH 

CPn_0630 722316 721885 

rl15-L15 Ribosomal Protein 

M iSGESLFD I SERKRRKKLU3RGPSSGHGKTSGRGHKGDGSRSGYKRRFGYEGGGVPLYR 
RVPTRGFSHKRFDKCVEEITTGRLAELFQEX3EAITLDALKAKKAIARQAVRVKVILKGDL 
EKJT FVWQDT AWL S QGVQNLLG IT 

CPn_0631 722812 722312 

rs5-S5 Ribosomal Protein 

e^slsknshkedqleekvlwnrcskwkggrkfsf 

ti|a£rkggeaakknlmkieai£ix3siphe^ 

emaGikdivaksfgsnnpmnqvkaafkaltglsprkdllrrgaaind 

CPn_0632 723354 722827 

rl18-L18 Ribosomal Protein 

KG IJ IS S WLVNLLQ VF APNVLLN L I KVR EFVMKMNMS WK LVKL RKQ AKNR S R VM ESS LC K 
KSlijEKRRRALRVRXVLKGS FTKPRLSVW 
GLT'KKNQEVAJO/IJGTQI AEIJ3KNLQLDRVWDRG PF KYHGI V 

CPn_0633 723760 723209 

rl6-L6 Ribosomal Protein 

SMSRKAREPILLPQGVEVSIQDDKI IVKGPKGSLTQKSVKEVEITLKDNSIFVHAAPHW 
DR PSCMQG L YWAL I SNMVQGVH LGFEKRLEMI GVGF RASVQGAF LDLSIGVSH PTKI PIP 
STLQVSVEKNTL I SVKGLDKQLVGEFAAS IRAKRPPEPYKGKG I RYENEYVRRKAGKAAK 
TGKK 

CPn_0634 724215 723787 

rs8-S8 Ribosomal Protein 

ESSIKRKRIYMGMTSDSIADLLTRIRNALMAEHLYVDVEHSKMREAIVKILKHKGFVAHY 
LVKEENRKRAMRVFLQYSDDRKPVIHQLKRVSKPSRRVTT/SAAKIPYVFGNMGISVLSTS 
QGVMEG SLARSKN IGGELLCLVW 

CPn_0635 724763 724206 

rl5-L5 Ribosomal Protein 

GERKANMSRLKKFYTEEIRKSLFEKFGYANKMOIPVLKKIVLSMGLAEAAKDKNLFQAHL 
EE LTM I 3GQ K P LVT KA RNS I AG FKLR EGQG I G AKVT LRG I RMY DFMDR FCN I VS PR I RDF 
RGFGNKGEGRGCYSVGLDDQQIFPEINLDRVKRTQGL^ITWVTTAQTDDECTTLLELiMGL 
RFKKAO 

CPn_0636 725100 724750 

rl24-L24 Ribosomal Protein 

FK EKEVMKKON I RVCDKVF [L^CNDKCKEGKVLCLTEDKVWECVNVRI KN I KRSQQNPK 
< JKR [3 IEAP ni [3WRLTrAC;EPAKLJVKVTE0GRELWQRRPLf7r30LYRLVRGKKG 

'.'('rij)t'. H ;j'>47 i 7.> 1 jO'J'> 

rl14-L14 Ribosomal Protein 

I M [-M L (J(j F.30 1 .K VADNTC 1 A K K V K l ' F K VLGG 3 R R R Y AT VC f )V rVC3VRDVCPN3:;iKKGDV 
I KAVI VftTRftl! ITRKDCrTLKFDTN^CVT [ DDKCNPKCTP rFCPVARE tRDRCFTKISSL 

apkv i 

f'l'li_«i. 1H /.»',// I 7J')4'>{) 



rs17-S17 Ribosomal Protein 

NKKEKVKSMA3EPPC3RKVK IGVWSAKMEKTVWRVER I F3H PQYLKWR3->KKYYAH . 
ELKVSECDKVK IQETRPLSKLKRWRV I EHVGWS 

CPn_0639 725979 725743 

rl29-L29 Ribosomal Protein 

ASGKGINMAAKKDLLTQLRGKSDDDLDAYVHE^KALFALRAENLI^N^yKVHMFSTHK 
KN T ARALTVKOEPKGJCVHG 



rl16-L16 Ribosomal Protein 

1 1 KLMPKRTKFRKQQKGQFAGLSKGATFVDFGEYAMQTLERGWVTSRQ I EACRVAINRYL 
KRRGKVWIR I FPDKS*vTKKPAETRMGKGKGAPDHWVAWRPGRILFEVANVSKEDAQDAL 
RRAAAXLG I KTRFVKRVERV 

CPn_0641 727092 726409 

rs3-S3 Ribosomal Protein 

KGRRIMGQKGCP IGFRTGVTKRWRSLWYGNKQEFGKFLI EDVR I RQFLRKKPSCQGAAGF 
WRRMSGKI EVT IQTAR PG LVIGKKGA£VDLLKEELRALTG KEVWLE I AEI KR PELNAKL 
VADN I ARQ I ERRVSFRRAMKKAMCSVMDAGAVGVK I QVSGRLAGAE I ARS EWYKNG RVPL 
HTLRADIDYATACAETTYGI IG I KVWINLGENSSSTTPNNPAAPSAAA 

CPn_0642 727440 727096 

rl22-L22 Ribosomal Protein 

RR H SMFKAT AR Y I RVQ P RKARLAAG LMRNLSVQ EAE EQLG F S Q LKAG RC LKKVLNS AVAN 
AELHENI KRENLSVTEVRVDAGPVYKRSKSKSRGGRSP I LKRTSHLTVIVGEKER 

CPn_0643 727725 727450 

rs19-S19 Ribosomal Protein 
EIRIbKRSLRKGPFVDHHLLKKVRAMNIEEKCTPIKTO 
FLTVFVSETMVGHKLGEFSPTRIFKSHPVKKG 

CPn_0644 728594 727722 

rl2-L2 Ribosomal Protein 

FIREINSMFKKFKPVTPGTRCLVLPAFDELTTRGELRCrrKSKRSLi?PNKKLSFFKKSSGG 
RDNLGH I SCRHRGGGAKQLYRWDFKRNKDG ITAKWTVEYD PNRS AYI ALLSYEDGEKR 
YILAPKGIQRGDVWSGEGSPFKPGCCOTI^SIPLGLSVHNIEMRPSSGGKLW 
QVIAiCSPGYVTLKMPSGEFRMLNEGCRAT IGEVSNADHNLRVDGKAGRRRWMGVRPTVRG 
TAMNP\TOHPHGGGEGRHNGYIPRTPWGK\/TKGIiOT 

CPn_0645 728933 728598 

rl23-L23 Ribosomal Protein 

DMKD PYDV I KRHYVTEKAKMLEHLSAGTGEGKKKGS FCKDPKFVF I VSHDATKPL I AQAL 
EAIYVTJK^^V^<VKSV^^^I^^/KPQPARMFRGR 

CPn_0646 729636 728950 

rl4-L4 Ribosomal Protein 

YREDLMVLLSKFDFSGNKIGEVEVADSLFADEGDGLQLI KDYIVAI RANKRQWSACTRNR 
SE^'SHSTKKPFKQKGTGNAROGCLASPOFRGGGIVFGPKPKFNOHVRINRKERKAAIRLL 
LAQK I G/ITfiCLTVVDDTVFVDALiT APKTQ S ALR F LKDCNV EC R S I L F I DH LDHVEKNENLR 
LS LRNLTAVKGFVYG ININGYDLASAHNI V I SKKALQELVERLVS ETKD 

CPn_0647 730490 729657 

rl3-L3 Ribosomal Protein 

YLEYFSYCKMjPPLITCPFIFLRENFLEFLENSISKILSRFVSLFLQEESKSLLLMDKFM 
RSH I SVMGKKEGMIHIFDKDGSLVACSVIRVEPNWTQ IKTKESDGYFSLOIGAEEMNAP 
AHT I TKRVS KPKLG H LRKAGG R VFRFLKEVRG S E EALNGVS LG DAFGL EVFEDVSSVDVR 
GISKGKGFQGVMKKFGFRGGPGSHGSGFHRHAGSIGMRSTPGRCFPGSKRPSHMGAENVT 
VK^EVIKVTJLEKKVLLVKGAIPGARGSrVrVKHSSRT 

CPn_0648 731636 730605 

CT529 hypothetical protein 

FFFKKPCKEVKMATNAIRSAGSAASKMLLPVAKEPAAVSSFAQKG I YC IQQFFTNPGNKL 
AK FVG ATK S LDKC FKLS KA VS DCWG S LE EAGCTG D ALT S ARNAQGMLKTT R EWALANV 
I^GAVPSIVNSTQRCYQYTRQAFELGSKTKERKTPGEYSKMLLTRGDYLLAASREACTAV 
GATTYS AT FGVL R PLML I NKLT AKPF LDKATVGKFGT A VAG I MT I NHMAGV AGAVGG I AL 
EQKLFKRAKESLYNERCALENQQSQLSGDVILSAERALRKEHVATLKRNVLTLLEKALEL 
WDGVKLI PLP ITVACSAAI SGALTAASAG IGLYS IWQKTKSGK 

CPn_0649 732672 731710 

fmt-Methionyl tRNA Formyltransferase 

LN LKVVYFGT PTFAATVLQDLL.H HK I Q IT AWTRVDK PQKRS AQL I PS PVKT I ALTHGL P 
LLQPSKASDPQF I EELRAFNADVF I WAYGA I LRQ I VLD I PRYGCYNLH AGLLPAYRGAA 
PIQRCIMEGATESG^^^VIR^^DAGME^!'GD^IANITRVPIGPD^1TSGELADAI^SCXJAEVLIK 
TLQQI ESGQLQLVSQDAALATI APKLSKEEGQVPWDKPAKEAYAH I RGVTPAPGAWTLFS 
FSEKAPKRLMIRKASLLAEAGRYGAPGTVVVTDRQELAIACS EGAICLHEVQVEGKGSTN 
SK5FLNGYPAKKLKIVFTLNN 

CPn_0650 733513 732665 

lpxA-Acyl-Carrier UDP-GlcNAc O-Acyltransferase 

S RRNMAS I H PT A 1 1 E PGAK IGKDW I E PYW I KATVTLC DNWVK S YAY I DGNTT I G KGT 
TIWPSAMIGNKPCDLKYCGEKTYVTIGENCEIREFAIITSSTFEGTTVS IGNNCLIMPWA 
HVAHKCT IGNNWLSNHAQLAG HVQVGDYA I LGGMVGVHQFVR IGAHAMVGAL SGIR RDV 
PPYTIGSGNPYQLAG ENKVGLQRRQVPFATRLALIKAFKKIYRADGCFFESLEETLiEEYG 
DI PEVKNF lEFCO^ PSKRGI ERS I DKQALEEESADK EC/LIES 

CPn_0651 733975 733517 

fabZ-Myristoyl-Acyl Carrier Dehydratase 

MNQP3VIKLRELLDLLPHRYPFLLVDKVLSYDIEARSITAQKNVTINEPFFMGHFPNAPr 
MPTJVLI LEALAOAA»JVLICLVLEtJDRNKR IALFLG IQKAKFRQAVR PGDVLTLQADFSL r 
33K^;KAWAOARVnSQ[ J VTEAEL3FALVDKE3I 

CPn_0652 [illegible] 733990 

lpxC-Myristoyl GlcNAc Deacetylase 

KRN" [ [ Y^D3L3r,l- YM[,ER'rORTLKREVRY3f]VGIHLVJK3STLHL0PA0TNT^ [VFQRQ3 
MX ;>fY ENVPALL iPHVYTIv ] R: ;TTL3R^3 AV IATVEHLMAALRSNN EDNL [ TO^.Tf'EE T P I 
CDC';'; NVFVKL I LVAG LCE0EDKV3 rARLTPPVYYQH0DIFLAAFP3DELK [3YTLHYPQ 
:;';TrCTOYK::LVrNEE3rROt:iA^'RTrALYNELCrU-lEKGLIGGGCLDNAVVFKDDCI I 
SRCijLRFADEiTVRHK [ LHr, LCDLr:r;/CRPFVAKVLAVC3GHS3NIAFGKK T LEALI'L 

CPn_0653 [coordinates illegible] 

cutE-Apolipoprotein N-Acetyltransferase 

( 3 E PVLR f FC FV T SWCL r AF AQ P DLSG FVS I LGAACGYG F FWYS L E P LKKPS L PLRTLFVS 
CFFWI FT I EG r H F3WMLSDQY IGKL I YLVWLTL IT I LSVLFSGFSCLLVAIVRQKRTAFL 
WSLPGVWVA I EM L R FYG I FSGMS FDY LGW PMT AS AYGRQ FGG FLGWAGQ S FAV IAVNMS F 
YCLLLKK PHAKMLWVLTLLLPYTFGAI HY EYLKHAFQQDKRALRVAWQPAH PPIRPKLK 
S P E WWEQ L LQL VS P I QQ P I DLL I F P EWVP FGKHRO VYPYSSCAHLLS S FAPL PEGKAF 
LSNSDCATALSQH FQCPVI" IGLERWVKKENVLYWYNSAEVISHKGISVGYDKRILVPGGE 
V T r> K FG f r , T ^ t> OT • F PKY A r r,r K R L PGR R ^rr/^/CVRGLPR I G I T T EETFGYR LQSYK 

->v.< ;AK;, t .v:i[/rrjr/,w/'! < m :, f 'KviiFr,ir;MLi j rof w vpa< "c^vz-t-aav nriLGP :lk 

i LPYLjTHETKAI-./JVLLT ^LfLFNYKTLYG'i C,L ( PMlLiArCAVGYI/JGOrLGYPU-AK 
KEIR 

CPn_0654 737051 736503 

vdlD/yciA-acyl-CoA Thioesterase 

KKIIDFLSVDRYYRNQEYPIKILSVESTMLXKKPVSFSCIIX^HIYKIFP^IDL^IAN^^VFG 
GL LM S L LDRLAL WAE RHT E SVCVT AFVD ALR FY A P AYMGENL t CKAAVNRTWRTS LEVG 
VKVWAENI YKQERRH ITSAYFTFVAVNEDNQP I PVHQIVPETPEEKRRYNEADRRRQARL 
ELK 

CPn_0655 737856 737101 

dnaQ-DNA Pol III Epsilon Chain 

KEIMSLLKDTVFTCLDCEMTGLDVKKDRIIEIAAVRFTFDSVISSIEFLINPERWSAES 
QRVHHISNAMLRDQPKIAEWPQIKAFFKFjGDYIVGHSVGFDLQVIJ^EMERIGETFLSK 
YT I IDTLRI^EYGDSP^INSLESIAVHF^A^YIX;NHRAMKDVEININIFKHLCKRFRTLE 
QLKQVLAKP IKMKYMPLGKHKGRCFSEI PLAYLQWASKMDFDSDLLFSIRHEIKHRQKGT 
GFSQVNNPFMEL 

CPn_0656 737842 738048 

No robust homolog present in Genebank/EMBL as of 11/7/98 

THNFLLLPLSLFDILLTVEGFLCLTLYFASVQRMPCEQKRVPGNLYYYYIAAHSSLCLSV 

CKDTMENKD 

CPn_0657 738476 738051 

yjeE (ATPase or Kinase) 

PMGRYRRVSHSSOETLLI^TEI^VLVPGAVLU.FGDYGAGKTEFVRGIVSGYLGDTIAE 
EVASPSFS I LHVYGNEPKRLCHYDLYR I DQKNQEY I FQDAEEDDVLC I EWADRLPKPRFC 

DTINIYITMQTNMEREIIIEKR 

CPn_0658 739180 738455 

CT538 hypothetical protein 

KRVGMD I SGAVKQKLLQFLG KQ KK P E LLATYL FYLEQALS LR PWFVRDK 1 1 FKT P ED AV 
RS&EQDKKIWRETEIQISSEKPQVNENTO 

PCJS1$£EKQGGVRIKRFLVSEDPDVIKEYAVPPKEPIIKTWASAITC 
S&Y^PMTLEEVQNQTKFQLESSFLSLI£DALVEDKr^ 

CPn_0659 739482 739838 

trxA-Thioredoxin 

I^S|llTODSNSIFREGKLMVKIISSENFDSFIASGLVLVDFFA£^GPCRMLTPIL 
ELHOTIGKINIDENSKPAETYEVSSIPTLILFKTCN^ 

CPn_0660 740327 739860 

spoU-rRNA Methylase 

MRWLHC PD I PQNTGN IGRTCVALGAEL I LVR PLG F S IJUDKFVKRAGMDYWDKLQLTVVD 
S^EALHDVPEDQIFCLSTKGSASYTEFSLPSSGTYVFGSESKGLPKEIIJCKYYKNCLRI 
P1|0QDI RSLNLAT SVG I VLYEVVRQKTVALQKNPTV 

CPn_0661 741139 740327 

mip-FKBP-type peptidyl-prolyl cis-trans isomerase 

HSR^LK I KD RRR KMWRWNL VLATVALALSVAS C DVRS KDKD KDCGS LVEYKDNKDTND I 

E^ONQKLSRTFGHLLARQLRKSEDMFFDIAEVAKGLQAELVCKSAPLTETEYEEKMAEV 

QKWEKKSKENLSLAEKFLKENSKNAGVVEVQPSKLQYKI I KEGAGKAISGKPSALLHY 

KGSF INGQVFSSSEGNNEPILLPLGCTI PGFALGMQGMKEGETRVLYIH PDLAYGTAGQL 

PPHgLL I FE I NL I QAS ADEVAAVPQEGNQG E 

CPn_0662 742938 741172 

aspS-Aspartyl tRNA Synthetase 

SKGeVMKYRTHRCNELTSNHIGENVQLAGWVHRYRNHGGVVF IDLRDRFG ITQIVCREDE 
QPELHQRLDAVRSEWVLSVRGKVC PRLAGMENPNLATGH I EVEVASFEVLSKSQNLPFS I 
ADDH INVNEELRLEYRYLDMRRGDI I EKLLCRHQVMIACRNFMDACGFTEIVTPVLGKST 
PEGARDYLVPSRIYPGKFYALPQSPQLFKQLLMVGGLDRYFQIATCFRDEDLRADRQPEF 
AQ I DIEMSFGDTQDLLP I I EQLVATLFATGG I E I PLPLAKMTYQEAKDS YGTDKPDLRFD 
LKLKDCRDYAKRSSFS I FLTOLAHGGTIKGFCVPGGATMSRKQLDGYTEFVKRYGAMGLV 

WI KNQEGKVASNIAKFMDEEVFHELFAYFDAKDQDI LLLIAAPESVANQSLDHLRRLIAK 
ERELYSDNQYNFVWITDFPLFSLEDGKIVAEHHPFTAPLEEDIPLLETDPLAVRSSSYDL 
VLNGYEIASGSQRIHNPDLQSQIFTILKISPESIQEKFGFFIKALSFGTPPHLGIALGLD 
RLVMVLTAAES I REVIAFPKTOKASDLMMNAPSEIMSSQLKELSIKVAF 

CPn_0663 744220 742901 

hisS-Histidyl tRNA Synthetase 

KSNHFERRHHVTVTLPKGVFD I FPYLADAKQLWRHTSLWHSVEKA I HTVCMLYGFC EI RT 
PIFEKSEVFLHVGEESDWKKEVYSFLDRKGRSMTLRPEGTAAWRSFLEHGASHRSDNK 
FYYILPMFRYERCXJAGRYRQHHQFGVEAIGVRHPLRDAEVLALLWDFYSRVGLQHMQIQL 
^IFLGGSETRFRYDKVLRAYLKESMGELSALSQQRFST^A/LRILDSKEPEDQEIIRQAPPI 
LDWSDEDLKYFNE ILDALRVLEI PYAINPRLVRGLDYYSDLVFEATTTFOEVSYAIjGGG 
GRYDCLICAFGGASLPACGFCVGLERAIQTLLAQKRIEPQFPHKLRLrPMEPDADQFCLE 
W:;OHLRR[XlIPrEVDWSHKKVKGALKAASTEQVSf"/CLIGERELISQQLVIKNMSLRKEF 
FGTKEEVEQRLLYEIQNTPL 

CPn_0664 744775 744557 

No robust homolog present in Genebank/EMBL as of 11/7/98 

lwfahamkkl i al ig i flvp t kontnkehdahatvlkaarakynlffvqdvfpvhevi ep 
i:;pdclviiyegwv 

CPn_0665 [coordinates illegible] 

uhpC-Hexosephosphate Transport 

KMNVWTKKF^PPKH IKF t EDQEWKKKYKYWR L R I F {'MF rGY [ 1'YYrTRKCFTFAMPTL 
tM/l* ;FDKA(jU:i IG. JTLYFSYG [.1KFV f jGVMSDQ r ;NPRYFMA [GLMrTGLTNrFFGMSS 
\'A VM-'AI-WW ;i.N( ;WFf>;Wt;WPPCAPLLT!fW'i'AK^E^^TWW:JVW:;T^HNIGGALrPILTGF 

i i DY:;f ;wi« JAMYVh ; i lc u ;mglvlinrlrdtpo-;[/;lpp r ekykrdphhahhegksase 
f :tkk r i-:s< .: :trk r LFTYvr .tnowi .wfi.aaa:^ i y r vpmavndwsalfl r etkhyaavk 



ANFCVSLFEIGCLFGMLVACWLSDK rSKGNPGPMNVLFSLGLLFAI LGMWF3RSHNCWWV 
DGTLLFVIGFFLYGPQI^IGLAAAELSHKKAAGTASGFTGWFAYFGATFACYPLGKVTDV 
WGWKGF F I ALLACAS I ALLL FL FTWNAT EKNT RS KA 

CPn_0666 746370 750107 

dnaE-DNA Pol III Alpha 

GFFLTWIPLHCHSQYSVLDAMSSIKDFVAKGQEFG I PALALTDHCNLYGAVDFYKECTQK 
GIQPI TGCECYTAPGSPFDJ'KKEKRSRAAHHL I LLCKNEQGYRNLC ILTSLAFTEGFYYF 

ri- rPKDi-L.p'^ (TJE'iL!. !. - : ?' r-*"%....: ;,l^wi ol-lk ^'- r TC/;uiK 

m^lejiagfkec^lkvc;. - ' -v -"iO"HT7ArMDr.<- :;ja:jdwqah 

EI LLNVQSGETVRIAKQNTH I PNPKRKVYRSREYYFKS PAQMAELFKD I PEV ISNTLEVA 

krcdftfdfskkhypiwpeslktlnsyteedryqasavflkolasealpkkyssevlak 

IAKKFPHRDP IDIVKERl^DMEMAI 1 1 PKGMCDYLL IVWD I IHWAKANG I PVGPGRGSGAG 

SVLLFLUGITEI EP IRFDLFFERF INPERLSYPDI DI D ICMAGRERVINYAI ERHGKENV 

AQIITFGTMKAKMAVKDVGRTLDMALSKVNHIAKHIPDLNTTLSKALETDPDLH 

AE SAQV I DMALCLEGS IRNTGVHAAGV 1 1 CGDQLTNH I P I C I SKDSTMITTOYSMK PVES 

VGMIJamUjGLKTLTSINIAMSAIEKKTGQSLAMATLELDDATTFSU^ 

S KGMQELAKNLR PDLFEEI I AMGALYRPG PMDM I PS F I NRKHGKE 1 1 EYDH PliffiS I LKE 

TYG IMVYQEOVMQ I AGALASYS LGEGDVLRRAMGKKDFQCWEQEREKFCKRACNNG IDP E 

LAWIFDKMEKFAAYGFNKSHAAAYGLITYTTAYLKA^PKEWUVVI^ 

LIREAQSMGIPILPPHINVSSNHFVATDEGIRFAMGAIKGIGRGLIESIVEERDHHGPYE 

SIRDFIQRSDIJCKVSKKSIESLIDAGCFEXrFDSNRDLIXASVEPLYEAIAKDKKFAASGV 

MTFFTLGAMDRKNEVPICLPKDI PTRSKKELLKKEKELLG IYLTEHPMDTVRDHLSPXSV 

VLAGEFENLPHGSWRTVF I IDKVTTKI S S KAQKKF AVLRVSDG I DSYEL PI WPDMYEEQ 

QEIXEEDRL IYAILVLDKRSDSLRISCRWMKDLSIVNENI IYECDQAFDRIKNOVQKMSF 

TMSTSGKCTKAKGNKPNFJWHTQALAPWLSI^LNELRHSHLCILKKIVQKHrc 

VFTQ DN ERVASMS PDDAYFVC ED I EELRQEL VT ADLPVRV I TV 

CPn_0667 751097 750177 

No robust homolog present in Genebank/EMBL as of 11/7/98 

NISt^KIQKRYFMKKLILYFAAFVASLFCGVFLWDRVPCAQKIMRIJVAI^SSEVFSKSC 

RFVRKISGFEELQVFERHVSPEOAI^FPEYPJX5KSFVELAFIPHTIJ^ 

HIISQEGEILWSLVN3EKVIJmnwrCSKGF^ECIJ J IJ4AGKQI>IRV 

S LAQALALKN I RAERV I KEC QKKKL I FASGNQ I GTH TQQ FQ ? I RGCTTT LNNNPVWLQK P 

RHAAVFPAQYSEDRVRHLVKM I FGCM^FL I VRS SMVYVPVYK ISLVS ADNSVRVEYI NAVT 

GKSFQDL 

CPn_0668 751176 752162 

CT547 hypothetical protein 

WRFVWSPRLIMKFIXYVPLIiVLVSTGCDAKPVSFEPFSGKI^TQRFEPQ 

GO EFLKKGNF RKALLC FGIITHHFPRD I LRNQAQYL I GVC YFTQ DH P DLAD KAF AS YLQL 

PDAEYSEELFQMKYAIAQRFAO^KRKRICRLEGFPKLMNADEDALRIYDEILTAFPSKDL 

GAQALY S KAALLIVKNDLT EAT KTLKKLT LQFPLHILSS EAFVRLS E I YLQQ AKKE PHNL 

QYLHFAKLNEEA^1KKQHPNHPLNEVVSA^M3AMREHYARGLYATGRFY 

YRTAITNYPDTLLVAKCQKRLDR I S KHT S 

CPn_0669 752140 752775 

CT548 hypothetical protein 

I EYLS I LPK I EI NMRLFSLGT I YLFFSLALS SCCGYS ILNS PYHLS SLGKSLLQER I F I A 
PIKEDPHGQLCSALTYELSKRSFAISGRSSCAGYTLKVEU^GIDKNIGFTYAP^^<LGDK 
THRHFIVSNEGRLSLSAKVQLINMTTQEVLIDGXTVARESVDFDFEPDLGTANAHEF 
FEMHS EAI KS ARR I LS I RLAET I AQQVYYDLF 

CPn_0670 752738 753196 

rsbW-sigma regulatory factor-histidine kinase 

PRRLLNRYTMTFFEGETVFPAVLSELHSMLDL IKRAGKQSKC PQEKLLKLELACEELLVN 
IISYAYQGENSPGTIAISCISHRGDLEWIKDHGPSFNPLiAVSINIQEDLPLEORKLGGL 
GIFLAKSSVDEFLYAREDHCNIVHLKMLNGOHS 

CPn_0671 753660 753205 

CT550 hypothetical protein 

RITINQRKYTMSLDFFEEFYHQSILi^^rcTSFPEGYLNIAEILSYPHCTDANTDFLCSQSD 
ND F 1 1 AES KDKLTLFNAD FA I WLVPELVCGG/A VT RG Y I A VSQG EGNY E P EMA F EASGQYN 
QS SL IL EALQLYLKDI KDTENALRS FRFNNDH 

CPn_0672 753723 755048 

dacF(pbp5)-D-Ala-D-Ala Carboxypeptidase 

T I KS PHMKRP FFTYLC 1 1 FYG SO AS LS LH AG LS F PEVRG AT AAWHADSGKVFYDKD I DA 
VI YPASMT K I ATALF I LKH Y PTVLDTL IKVKQDAI AS I T PQAKKQSGYRS P PHWLETDG S 
T IQLHLREEIXGVroLFHALLVCSANDAANVLAMACCGSVEKFMDKLNFFLKEE IGCTHTH 
FNNPHGLHHPNHYTTTRDL I S IMRCALKEPPFRGVISTTSYK IGATNLHGER ILSPTNKL 
LL PGSTYHYPPALGGKTGTTKTAGKNLIMAAEKNNRLLVT IATGYSGPVSDLYODVI ALC 
ETVFNEPLLRKELVPPSDCLOLEIANLGKLSCPLPEGLYYDFYASEDREPLSVSFIAHAD 
AFP I EQGDLLGHWVFYDDEGKK ISSQPFYAPCRFERTIKPWKLYMKRVFTSYRTYMS ITM 
LLMYFR I RKHRKYKNLKHY'SK I 

CPn_0673 755242 755463 

CT552 hypothetical protein 

GKSTEGKAYHCFLKQVS IALNREEVWDNPHHLMF I L MQFQQ F SG EQ DR FGS F L EAT I RDR 
VSFLVLQEKIATLK 

CPn_0674 756689 755577 

fmu-RNA Methyltransferase 

RG I LYVTMV P FRQHHAYQL LKQ LHTS A I S EADRVS YY FKQNR S LG S KDRQW I QN 1 1 FN I L 
RHPRLLETLI LDSGEQVTPEALVAKVNEGVLENLDSYSA I PWPVR YS I SDDLAHFLVQDY 
GEEQAEEIAKIWLTEAPITIRVTITDKISVKELQEKLEYPSSPGELPEALHFSKRHPLQST 
EAFRRGFFE IODENSQRI SQG I "LTDKD I VLDFCAGACGK3L IFAOKAKHWINDSRKAI 
LQTAKH RL LRAC ARNF SLADQ L P LG S F GW I VDAPCSGTGV FR RH P EH KWQFS KKL LLNY 

VRVOKCrLKOASAYVGPRGRLWITr:;LLKEENEAHVAYMHSU3WKEVHRKTLPLOVGKG 
DAFFTSHFOKI 

CPn_0675 [coordinates illegible] 

CT[illegible] hypothetical protein 

VPL.' J MrLDFOFSrGYYL,KVLCLA[Hr/n , P ri-AYbRKKLLLDAWl'VNUI't^rrNYDT^VnTI 
PUVIHELFSW^AIliY^UT^RLl,/ I rHLPLULLK Wf ;W[.YRLFFP:;KYI I [ KKA tVDKLCM 
FK.^LELFEGKRPVDK [VQA.\NKVnKr;K iNP : : IW H D F'H f F/7TV: ' EVj'P P ! .At * E VOR R LAA 
DAGLQMI IEALTTLLEGHTAYLM.'ir.r^LIjNOP T f JEKAyl'f .KTL.TKK::YVt .I.R EL, [(JLFSL 
'JAEDFQTI TMSI I'ID.'jLSEVLA! I'JL fjf KJI'I tff IKTFVGLWOKTALASPKUMKI AU1FL 
AEVt.RKVTVEKKLHV^K^IiN'rrPEE^/Gni Y:' [ I'f>jNPALV/DKM PI'MLI ,MRWM ,I;YDRD [G 
TAI .P KAAEYYNPH PH FWROFLPLWQP p P 

CPn_0676 [illegible] 758051 

homologous to CT695 

GMGTP 1 5GNDGDRNT ISDPLEE3AAEEGDGDLEDRVSESATQVI ET IADTGI PEATPSEG 
TNGDU^SDLVTJRVEYEARGSLLTTMLARIRi<AVSQrWMHVOT 

LLKATRLPKETAEPPYFYALETALASCR5FFFHVFLRLFTLLRRQHPEAPLDLCGTDPIS 
PEAAVAFAL I LRSCCKWVATDAVQEGLPLEVI EEAGMYKAFS LEATTTVEEVSKRLSELL 
YSDKR rDGLANVRGITKIITSPYLGAGCCVSVVDNLKTYDI^ 

KG EN FA LVM K D [ I ,Y X A/RQ DR n K FT /GDFLMMWS E EH AS EVNYDWLA I LEVNL P ILEEDYR 

.:ii!-[.AYvKK:.»Yvr 'ji-'i ; „ ; • t rv 

CPn_0677 760410 759256 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RIA>K3INPSGNRSPDDVWVRGAG^DSSSTCXn*GATNSNLGAHNVTTSTSQPQVASKAKQL 
WQTVR E FFLGKKS PDS SQGASG PAMQSPSG PT IRPTRPAPP P PTTGGANAKRPATHGKGR 
APQPPTAGSSSGS EQPTAMS SEVAKLVS ELKDAVHS HAESQKVLKKVSQEUJTKWTDWEN 
NRG PDY LLH GYR V I ARALQCTYTEQSML I EGT S STG PVPQ A VTVAK DAVTQTVRGA I KNL 
ENPKPGNDPDGVLMQWISLGIEGPTLDPGESIQNFLETRVSDFGGDDSDrDYTSDIARL 
GSALDRVRENH PNEMPRIWI ALARELGAAVHSHATSVRI ANAGKNHTRDWRMANES SRL 
UXIMKVLSVGAWANTMTVLIGDLFE 

CPn_0678 761329 760682 

No robust homolog present in Genebank/EMBL as of 11/7/98 

K 1 1 MSVNP SGNS KNDLW I TGAH DQ H P DVKESGVTSANLG SHRVTASGGRQGLLARI KEAV 

TG FFS RMSF FRSGAPRGSQQ PS APS ADTVRS P L PGGDARATEGAGRNL I KKGYQPGMKVT 

IPQVPGGGAQRSSGSTTLKPTRPAPPPPKTGGTNAKRPATHGKGPAPQPPKTGGTNAKRA 

ATHGKGPAPQPPKGILKQPGQSGTSGKKRVSWSDED 

CPn_0679 762936 761725 

pgk-Phosphoglycerate Kinase 

GYMDKLTVQDLS PEEKKVLVRVDFWPMQDGK ILDDIRI RSAMPT INYLLKKHAAVILMS 
HLGR PKGQG FQEEYSLOPVVDVLEGY LGHHVPLAPDCVG EVARQAVAQLS PGRVLLLENL 
RF H IGE EHPEKDPTFAAELS SYGDFYVNDAFGTS HRKHASVYVVPQAFPGRAAAGLLMEK 
ELEFLGRHLLTSPKRPFTAILGGAKI SSKIGVI EAIJLNQVDYLLLAGGMGFTFLQALGKS 
LGNSLVEKSALDLARrArtiKIAKSRlWTIVLPSD^ 

FDIGPRTTEEF IRI INQSATVFWNGPVGVYEVPPFDSGS IAIANALGNHPSAVTWCGGD 
AAAWALAGCSTKVSHVSTGGGASLEFLEOGFLPGTEVLSPSKS 

CPn_0680 764254 762971 

ygo4- Phosphate Permease 

Y S ML PL 1 1 FVLLCG FY T SWN I G ANDVANAVG P SVG SGVLTLRQAWI AA I FE FFGAL LliG 
DBY&GT IES S IVSVTNPM I ASG DYMYGMTAAL1ATGVWLQLAS FFGWPV STTH S IVGAVI 
G0GEVLGKGT 1 I YWNSVG I IL I SWI LS PFMGGCVAYL I FSF I RRH I FYKNDPVLAMVRVA 
PF£&ALVIMTLGTVMISGGVILKVSSTPWAVSGVLVC^^ 

PifeSLTYRLKERGGNYGRKYLVVERIFAYLQ 1 1 VAC FMAFAHG SNDVANAI APVAGVLR 
QA^ASYTSYTLIPXMAFGGIGLVIGIAIWGWRVIETIVGCKITELTPSRGFSVGMGSALT 
IAISSILGLPI STTHVWGAVLGIGLARG IRAINLNI IKDIVLSWFITLFAGALLSILFF 

FAJjgALFH 

CPn_0681 765001 764258 

CT691 hypothetical protein 

NGjISSHKSFTRSFROVI IAKKAII^QTLARLFC^SPFAPLQAHLEMWSCVEYMLPIFTA 
LROGRYEELLEMAKLVSDKEYQADC I KNDMRNHLPAGLFMPISRAGILEI IS IQDS IADT 
AEWAILLTIRRLNFYPSMETLFFRFLEKl^EAFELTMTIXHEFTCQLL 
RIM3RVAKSEHESDVLQRELMQ I FFSDDFI I PEKEFYLWLQVI RRTAG ISDSSEKLAHR 
INMT-tEEK 

CPn_0682 764912 765955 

dppD-ABC ATPase Dipeptide Transport 
TS^GLHKNSLFFJtfN^PKRSCKRIJlASNPILCIEDL 

fflfa&E IGESGSGKSVSAHAILRLLPCPPFSVSGQVNFQGHNLLTASRSIQKKI IGTEISM 
IF^QNPQASLNPVFTIEQQFREIIHTHLALTA£^AKEKMLYAIjEETGFHDPRLCIiNLYPHQ 
LSj0€3MLQRIC IAMALLCSPKLL IADEPTTALDVSVQYQ I LQLLKTLQKKTGMSLLI ITHN 
MGWAETADDVLVLYAGRMVECA PAVQMFHNPSH PYTRTJLLASRPSLrQPQQLGS FNP I PG 
QPlPflYTAF P SGC RYH PRCSK I LNRC S AEAP E I Y PVREG HKVRCWLYDD 

CPn_0683 765936 766919 

dppF-ABC ATPase Dipeptide Transport 

GVi^MTTNFPQPLIQATSLTKHYYKRSFWFCGKTIASRPVDDVSFSLYSRRAVGLIGESG 
SGKSTLALALAGLLPLTSGFLTFNGTP IKLHS KHGRHQLRSQVHLVFQNPQASLNPRKT I 
LDSI/aHSLLYHKLVPKEKVIATVREYLELVGLSEEYFYRYPHQLSGGCXXJRVSIARALLG 
VPQLI I CD E I VSALDLS IQAQI LNMLAELQKKLS LTYLF I SHDLAWRS FCTEVF I MYKG 
QIVEKGNTKRIFSDPQHPYTRWLLNAQLPETPDQRQSKPIFQEYHKDSEESCSTGCYFYN 
RC PQKQEACKSE I I PNQGDAHHTYRC I H 

CPn_0684 768056 767181 

spoJ/parB -Chromosome Partitioning Protein 

EKSGDIVTEEISKDTIIEVAIDDIRVSPFQPRRVFSNEELQELIASIKAVGLrHPPWRE 
I CTGDRVL YYEL I AG E RR W RAMQ LAG ATT I PV I LKHVI ADGT AAEAT LIEN I QRVNLN P I 
EMAEAFKRL IHVFGLTQDKVAYKVGKKRSTVANYLRLLALSKTIOESLLQGQ ITLGHAKV 
ILTLEDPILREKLNEIIIQEHLAVREAELIAKQLISEEGSSIELKPTPLDMAESSKQHEE 
LQQRLS DLCGYKVO IKTRGSKATVS FHLQNTQDLQKLEAWLSSHGTLS ES LS 

CPn_0685 768020 768217 

No robust homolog present in Genebank/EMBL as of 11/7/98 

FPOSQYLLIFPNRILDLQAFEILDVQGMLTDQRKHIQMLHKHNSIEIFLSNMWEVKLFF 

KTLK 

CPn_0686 768373 768176 

No robust homolog present in Genebank/EMBL as of 11/7/98 

AKDS-MMPCGRLFRV/OELFFFSSVYVCEQRRPRKLYPGLOHLNFPIEKPRFLLKGFKKEL 

HF-'YNHV 

CPn_0687 768501 769214 

CT482 hypothetical protein 

UK I HKNLRH AYR FfJTPNCRLTMCKLVHN IWKK TY0FG3A t A IC EVLAJFL.';LK I VSNTYK 
H : :Q A K P W: J [ L L LT RAA EVA VSQG FL PS K S AL 'J J L EGA Y H LOG ESM K P Y AG F L Al *C F Y I H N 
KPLiV JAYYA' JLJWMMnOALQLPHP EQKLLKEISLAOADOLYDVALSKJYOLLOTANSLJPE 
YPTr.^'FLTr.LRVrELKELLitODV^ODFAALKSSPLFHQFEPMYrJDGEWTLCKRFGKKG 

CPn_0688 [illegible] 770137 

CT481 hypothetical protein 

3LML I VLAFRQVFFGHSRSQLDRLKNYLR LLKONF A ITL PK CRT "K ,K u-MLTFCFASFD 
FYTN r F P F L E EQK r PA WG VA3 R Y I T SNAAQ D Li 1 PG H R L K V ' ; ET LA FO D E I F oNYM P FCC 
QNELrEMAKSPYIQIJVSSGFAIP^Li^PPYLTTEILLJRHHrrriT'JAKPLAFLFPFGK 
SDPTSRKLAADHYPYSFLLCOTrNRKLKTHNIYRLDEKPMQYVCPGLFQ^GRYLKNWIKE 
KSKQLYLKKQLPKR 

CPn_0689 771407 770187 

yfhO-NifS-related Aminotransferase 

\"TUaV-V\V^-:'^/?S'.\-' va"' ri " ,,[ tua..\. vMrt/.-'-t v."t.\: 

HHANVLSWEIACRRRGSLVKKI RVHDSGL IDLDDLEKLLNECAQFVG I PHVSNVTGCVQP 
LQQVAELVHRYDAY LA VDG AQGAPH L P I DVQ LWOVDFYV FSSH K I YG PTG IGVLYGKKDL 
LDQLPPVEGC^SDMVAIYDHONPEYLPAPMKFEAGTPNIAGVLGLGAALJDYLDGLSAKFIY 
DKEIALTTYLHKELLEIPGVEr LGPS IEEPRGALIGMTI DGAHPLDLGFLLOLRGIAVRT 
GHQCAQPAMERWNVGHVLRVSLG I YNDEDD IDQF ILVLQDSLDK IRR 

CPn_0690 772704 771436 

ABC Transporter Membrane Protein 

LSVLRGDKVLVSIETFSSIASGSPVQKAAEACYTQYSKOPSSKEV'LSSFSW^QSLSLFPD 
RYNLATGASELIKQHWLHNNHSLAFECILINGKYEPSLSQLPEGVIVCGIDEARGSLSSF 
MC/3FDVNKHPLAFLNAVCSEDRGWrYIPEEMCrSDPIFVRHISFPTVSDHDVIFSPRIV 
VI LGQRASAQIQ ISHDVDLEMVGSSKTI VNGVTELFVGEGADLTVFMVPGYS EEDTLSWS 
TIATVEKDAICRMTQ^^^ESCC<3FGWFDNTSYIVGKKGHAESLVLVQSPRKTWV^^^SH 
DAEETVSRQNI KS ILYSGHFLFEGT I S I SSQGCLSDANQKHDTLLLSSEARVSTFPRLEI 
ETDEVKASHGATVGPLDPQQIFYMRSRGMTEAEAQEKLIHGFLKQGLVSDTFLGSSFQLN 
QTS 

CPn_0691 773467 772685 

CT691 hypothetical protein 

RGIijSMLKIKHLHASCNDWILDDFTJLNIQPGTMHVIMGP^AGKSTLAKILA 

S SGE I ALQEQNLLSMLPEERS RAGLFVG FQMP PE I PGVNNKMFLRDAYNARRRANQ EGD I 

SIDEFTCTLIjSTVLETYEYNATTDLFLDRNV^GF 

dsglxvxjaij^icrvlekyrelhptsslcivthkpklgnlirpdvvh 
slmheleaksyqevtkrvawr 

CPn_0692 774945 773461 

ABC Transporter 

IQEFCATGLKVMGESVKVFLEEREDYPYGFVTPrESCjGLTRGLSEETIEEIAALRNEPQF 
1 1 DFRLOAYRYWKQLHEPAWARLHYG P I AYDD I VYF SS PKQKKPLG RLEDADPE I LDTFK 
K1£IPLDEQKRLLNVENVAVDLVFDSV^^ 

YIXSSWSHRDNFFAALNAAVFSDGSFVYVPKGVKCPMDISTYFRIN^ 

EDGGYASYL EGCTAPAYSSNQLHAAWELVAH EHAVIRYSTVQNWYAGDKKTGKGG I YNF 

VT KRG LC AGYRS KI SWSQVEVG AAI TWKY P SC I LKGDES VGE FY SVALT SG KMQ ADTGT K 

MLHVGKRTTSWISKGISSDESKNTFRSLVSLGKKAEHSSNYTC?CDSMLIGKASGAYTO 

KI VVENSTSS IEHEATTSKLR£DQLLYLRSRGLSPEEAVSLVIHGFCRE 1 1 EQLPLEFAO 

EASKLLLIKLENSVG 

CPn_0693 776292 775240 

TPR Repeats (O-Linked GlcNAc Transferase homolog) 

LRSTNHVU3EISMEEAAKHLAKEFIXSGIKLFLSGEYEQAEKRXJCETLELDSTAALAYCY 

LG 1 1 AL ETG RVS EALNWCS KGLASE PG DS YL R YCYG VALDRGNQY EAA I EQYS AYVALH P 

DDVECWF SI/jSVYHRLKRLQEALJXFDKI LALDPWN PQS LYNKAVI LSEMDDEAES IRLL 

EVAVAKNPLYWKAWVKLGFLLS RSKRWDKAT EAYERWQL RPDLSDGHYNLGLCYLTLDK 

TRIAIJCAFQEALFLNAEDADAHFYVGLAHLDLK^ 

YLH^COETDKATKELLFl^KKDSTFAPLLQKTWSDPSSMQFERRLDT I S 

CPn_0694 779635 776330 

pbp2-PBP2-transglycolase/transpeptidase 

FSDESEAHNIHSMKilPKKFPIYLSIAQKTNRLLSGIVIAFAVIALRLWLAVVEHEQKLE 
EAYKFQIRVLPQYVERATICDRFGKTLAVNQLQYDVSVAYGAIRDLPTRAWRVDEHGHKQ 
LI PVRKHYIMCLSELLSOELHLDREAI EDAIHAKASVLGSVPYLVAANVSERTYLKLKML 
SKDWPGLH\ZZAWRRHYPQESVASDILGWGPISLQEYKR\rrQELSQLRECVRAYEEGED 
PKLPEGLASIDQVR^LLESVESNAYSLiNALVGKMGVEACWDSKLRGKIGKKPILVDRRGN 
FIQEMEGAVPEAPGTKI^LTLSAEI^AYADALLLEYEKTETFRSAKSLKKREKLPPLFPW 
IKGGAI IALDP^^^GEIIJ\MA£SPRYRKNDFVNAKVAEDSKAVRSSIYRWLENKEHIAEIY 
DRKVPL IRERRNPLTGLCYEEI LPLTFDCFLDFLFPENSVIKLQLKRNSFVGQAI EVQNL 
VTRLLSLFPYEEGTCPCSAIFDAVFPNEEGHILIQEVISLQEQKWIMECLNQHKADIEEL 
KEALDQVFNEL PANYDK I LYTD I LRL I VDP ER FS PVL PS EVHRLSLSEFT ELQG RYWLR 
S AF STI LEDAF I EVHF KSWRKS EFLQYLAAKRQEEALRKQRYPT PYVDYL EE EKTROYKM 
FCQEHLDTFLAYLFSKTPYKEGLEPYYDILDLWINELDNGAHRALSWHEHYLFLKERVSH 
LS EHLPALFSTFREFNELQRPLLGKY PIS IVRNKRQTEQDLAAS FY PVYGYGYLRPHAYG 
QAATLGS I FKLVSAYSVLSQRILWGHNEEPANPLVI IDKNSFGYRSSKPHVGFFKDGTPI 
PTFFRGGSLPGNDFMGRGFIDLVSALEMSSNPYFSLLVGEGLGDPEDLADAASLFGFGEK 
TGLGLPGEYAGRVPHDI^YNRSGLYATAIGQHTLVVTPLQTAVMLASLV^KX^WYVPKLL 
LGEWEG EHVS YLSSKKKRTIFM PDAWEVLKTGMRNV I WGOYGTAR A IQSOFPPOLLSRI 
IGKTSTAESIMRVGLDREYGTMKMKDIWFAAVGFSDQDLSLPTTWI\A'LRLGEFGRDAA 
PMAVKM I DMWEK IOORESFLRG 

CPn_0695 780201 781382 

homologous to CT695 

SLEVSMKKLLKSALLSAAFAGSVGSLQALPVGNPSDPSLLIDGTIWEGAAGDPCDPCATW 
CDA I SLRAGFYGPWFDRI LKVDAPKTFSMGAKPTGGAAAiNYTTAVDRPNPAYNKHLHDA 
EWFTNAGF IALNIWDRFDVFCTLGASNGY I RGNSTAFNLVGLFGVKGTTVNANELPNVSL 
SNGWELYTDTSFSWSVGARGALWECGCATLGAEFOYAOSKPKVEELNVICNVSQFSVNK 
PKGYKGVAFPLFTDAGVATATGTKSAT I NY H EWQVG AS LS Y R LNS LVPY IGVQWSRATFD 
ADNIRIAQPKLFTAVLNLTAWNPSLLGNATALSTTDSFSDFMOIVSCQINKFKSRKACGV 
TVGAT L VDAD KWS LTA EAR L INERAAHVSGQFRF 

CPn_0696 781703 [illegible] 

CT[illegible] hypothetical protein 

N. f ;G F YV RM P L LT V :*NF E I EV'0: ' LEC.Q : K 'KLT I KDLM^AGAHF^ J! IQTRRWNPKMKLY I FEE 
KNGLY [ INLAKTLvVLI'MALi ff [RKV [ODNKTVLFVriTK KQAKf 'VI RF.AAIEAGEFFIAE 
RWL/r.MLTNMTT TRN'; [KTLDK [ EKDLSRNOAYrvrKKCAALLAKHHOKLLKNr.CG [RYMK 
KAP^iLLVWDr^YEK IAVAlI\KKt / > I I'VI .ALVDTNf'Dl'Tf' I DHV I f 'rNDDL'I ,K:! TRLI IN 

vtYvm iFj\KHKU^[r.rv'nvK';[j-v!-[a.:;Ar!:::r f (.)[XjL:;iJt-;i:N^Ki-iM,r^KKFr-x;E-AN 

CPn_0697 [coordinates illegible] 

tsf-Elongation Factor Ts 

W}^ \Y I ,M. f TjF' IMi'Tt .KTLPWI'f JVM.TKi KI .Al ,1 'At f V INr.KKAWYI ,HKU II .A; 'tV '.KKEHR 
LTKWJI [ AAKTP \N< ITAL I L'VMVf Tf)l VANNAVP'KKFV.'INr.LNn 1 1 ,K"i KVI/I'VKAf ..JjAA 
'V.:<jlA".\L ;VDELI\-\VTM f JTVGKn I \ f \ ^l/VAYI'f'KA'ITJIITVG f Y.'iHdNt IKTVAI iTH[.;;f 



TADSLAKD I AMHVVAAQPQFL3KEGVPAEA I AKEKEV I A3Q IQGKPQEVt EKIVTGKLNT 
F FQF^CLLEQPFIKNADLSrQSLIDDF^KTSGSSVAI EQFILWKIGA 

CPn_0698 783443 784201 

pyrH-UMP Kinase 

EPNKNMAKQTRRVLFK ISGEAL3KDSSNR IDEMRLSRLVSELRAVRNNDIEIALVIGGGN 
ILRGLAEQKELO rNRVSADQMCMLATLINGMAVADALKAEDr FCLLT3TLSCPQLADLYT 
POKS r EALDOGK [L [^TTGAG^PYLTTDTGAALPACEIJ^VLTKATMHVDGVYDKDPRL 

■■'f-nA*. r - vr»FV:"t' ! : »r ! . r;.n . ;v;-v/ :/ p Mr. j; :~ frvf ■i-l/^i- ,Li -VvL,rnrri<rr; 
V:*i-:LVNHV :::!!■:!] 

CPn_0699 784179 784721 

rrf-Ribosome Releasing Factor 
TMSVLQDTEKKMAAALDFFHK£VK3FRTGKAHPALVET\A^ 

LRQLVIS PYDGNNASAIAKGI I AANLNLQPEVEGS I IR I KVPEPTADYRQEMIKQLRRKC 
EEAKI^rVRNIRREA^roKLJCKDSALTEDVVKGNEKKIQEt.TDKFCKQLDELTKQKEAEIAS 
I 

CPn_0700 785094 785609 

CT676 hypothetical protein 

lmvhspthckt^hcqqpaticyteidkdkvirswcatcpcpshyynnehi^lskgvgvlt 
lecgncktvwhskqddeqli^hcc™fknqitsklkserwsssftmekgcoslhigr 
apgeasntnpllklialnealodtleredysqaavi rdq i nhlktknpddps 

CPn_0701 785584 786672 

karG-Arginine Kinase 

KPKIQMTLPNDLLETLVKRKESPQANKVWPVTTFSLAi^LSVSKFLPCLSKEQECLEILQF 

ITSHFNHI EGFGEF IVLPLKDTPLWQKSFLLEHFLLPYDLVGNPEGEALWSRSGDFLAA 

INFQDHLVLHGIDFC^NVEKTLIX5LVQ]^SYLHSKI,SFAFSSEFGFLTTNPKNCGTGLKS 

QCFI^IPALLYSKEFT^IDEEVEIITSSLJXGVTGFPGNIVVLSNRCSl^ 

LRITASKLSVAEVAAKKRLSEENSGDIJCNLILRSLGLLTHSCQLELKETLDALSWIQLGI 

DLGLIKVTENHPLWNPLFWQIRRAHLAIi2KQAEZ3SRJ3U2KDTISHLRAS 

ESF 

CPn_0702 789700 786929 

yscC/gspD-Yop C/Gen Secretion Protein D 
LKKNFVKTVILNIGRKILCXjIKi(KKKKIGILSGLFFI£>L^ 

RDEKLAACPKNSAASLSAKKSHTKKTTPGSIPSKVFSKFDATQDKTFQKTSGSAFPAKPT 
TLKELE ERKK PR P ERRTTADVK RS P RFL PTQ EVE E P VP AAS KEQLD S I QVWE EKQNYARR 
AVNAINLS I KKQLEEQTSTVTEKDVQ PKTQATPHASKKNVASPSTSMPG I EKAATTVAVP 
Q DKSEEEKVKERLTKR ELT CEDLKDNGYTVNF ED I S I L ELLQ FVS K I SGTNFVFDS NDLQ 
F&viT I VSHDPTSVDDLST ILLQVLKMHDLKWEQGNNVL I YRNPHLSKLSTWTDSSLKE 

tceawvtrvfrlysvs ps aavni iqpllshdai vsaseatrhvi isdiagnvdkvsdll 
a^pcpgtsvdmteyevkyanpaajlvsycq 

la^taeqlijcsldvpemahtr^dpastalaijggtgtts pkslrffmyklkyqngevi ana 

lqijig ynlyvtt amdedf i ntlns iqwlevnns ivi ignqgnvdrvigllngldlppkqv 

yjeylili^slekswdfcv'c>a/algdeqskvayasgllnntc 

i pept pgqltgfsdmlns s s afglg i ignvls hkgks fltlggii^aldqdgdtvi vlnp 

rj^qdtqqasffvgqtvpycttnti iqetgtvtqn i dyed igvnlwt stvapnnwt l 

qxzqt i s elhsasgsltpvtdktyaatrlq i p dgcflvmsgh i rdkttkwsgvpllns i 

p§i;eirglf srt i dqrqkrnimmf i kpkvi s s feegtrvtnkegyrynweadeg smqvaprh 

AfgCQGPPSLQAESDFKI IEIEAQ 

CPn_0703 791205 789685 

pkn5-S/T Protein Kinase 

RK I G FMDCRGG I PL PE PQV IGGYHVKKI LS KKLRSR WHGLH P ETRHSTV IKVFS P S P S F 
TSRSVYNFLKEAQSLHQ ITHPNIVKFHRYGKWQDCLYIAMEY I EG I SLREYILAQF I SLP 
QA. ID I I FDIAQALEHLHSRWILHKDIKPENILITPQGKIKLIDFGLADWDTEIQRAHPSV 
IG^PYYMSPEQRCGESHSPASDIYAIJ3LI^YELIIJ3HLSI^RWLSL\^ERrSKILAICAL 
QftSIP^RYSSTREFIQDIHHYRMSGDMQEDLRIKDHTVALYEOLQTQRFWLAPETLRFPD 
F f SGVLYHQGYPLYPHAYDTLLEGDVFNLWLGYSP I SNAT I ALSWKSLVCQQDLQRPLL 
DteEINECLIRMKIPIDE^ISILCLEISKE^ELSWIACGKTVFWIKRQGRWQDFES 
F$ PGLGK IT SLQ I RETKVAWE I GDEAWCT LELEESVAS LKTLSLAELQDRRQKAI FCP I 
E$£HGGIQSRQHGSNSPSTLISLKRIR 

CPn_0704 792330 791209 

fliN-Flagellar Motor Switch Domain/YscQ family 

RY^^VAADSSASWLKSRNNFLSSLGKTEEQVAAPEFPKELCQHKIREKFRLEDVQVSIK 

F RG S IT AVEATKE FGVHLL IQPMWQ PWEVENLLFLT S E EDLQ ELMVAVFDDASLASYFY 

EKDKLI^FHYYFVAEACKLFEELQWVPSLSAKVGGDAIFTATSLOGSFQVVDISLRLDGK 

NVRCRLLLPEDTFQSCQKFFSGLHDESDLHNIDQTOQISLSVEVGYSQLTQEEWHQVVPG 

SF IMLDSC LY DPETEE SGALLTVQKHQ FFGGR FLTPSSG EFK I T S Y PNLTHEDPPLPENP 

QAS AAPLPGYSRLWEVARYSLAVS EFI KLNLGS ILSLGNH PAYGVDI I LDGAKVGRGE I 

I ALGDVLG I RVLEV 

CPn_0705 793176 792334 

CT671 hypothetical protein 

FMELKKTAESLYSAKTDNHTWQNSPEPRDSRDVKVFSLEGKQTRQEKTTSSKGNTRTES 
RKFADEEKRVDDEIAEVGSKEEEQESQEFCLAENAFAGMSLIDIAAACSAEAVVEVAPIA 
VSS IDTQWI ENI I LSTVES MV I S E I NG EQ LVELVLDAS S S VPEAFVGAN LT LVQSGQDLS 
VK FS S FVDATOMA EAADLVTNN PSQ L S S LVS ALKGHQLT LKEF S VG NL LVQLP K I EEVQT 
PLHMI AST I RHREEKDQRDQNQKQKQDDKEQDSYK I EEARL 

CPn_0706 793681 793180 

CT670 hypothetical protein 

YAVAKY PLE PVLAI KKDRVDRAEKWKEK RR LL E I EQEK LR EK EAE RDK VKNHYMQ KIQQ 
LRDLLDEGTTSDAVIjQIKSYIKWAVQLSEEEEKVNKQKEVVLAASKELEKAEVNLAKRR 
KEEEKTRLHKEEWMKEALKEEARAEEKEQDEMGQLLFQLRQKKKRESGGS 

CPn_0707 [illegible] 793704 

yscN-Yop N (Flagellar-Type ATPase) 

VN M DQLTT D F DT F ,M I3QF. ,t j DVNLTT WG R rTEWGML [KAWPNVRV^E.VCLVKRNGMEPL 
VT EWGFTO S F A F LS PU E LSG V J P S S EV I PTG LPLH I RAOfJClLL^RVLNGLGEP I DVET 
Kf ;pf Jjtr /IX^TI-'I * [ FR APPD Pt.H RAKLRO 1 LiJTt ;VRC ILCMLTVARv'OR IG I FAGACVGKS 
:;LLt;MI AHNAKKAIJVNV LAL U JERtJREVREF I EGDL/,ED JMKR A I W^T^DQSSQLRLN 
AAYVr;TAIAEYFRL/J^:KTW[>^MD:;VTRF^RALRC7^I^V J EPP^RA^.;YT^'^VFSTL 1 PRL 

r,ER:x;A:ji)K(vr rTAFYTVi.vAiinnMNEPVADCVK.-; r uy-.n rvi. :n \ laoaymypaidvla 
:;r:;RLLTArvrT-:EVRR i i^karcvlakykancmlfp i r ;eyrf^ >pke i pfa [ dh i dklnr 

R ,KQb 1 1 f EKTNYEEAAgOl >RA I FR 
CPn_0708 [coordinates illegible] 

CT668 hypothetical protein 

AFKTVKRFFCFMIDPVECFPMLDGDAEAOSrTON-C ITFLAi'CLKKD "PFAUWYAAPKD 
TTLVCOFKPNPMAMM0D0N:;fJL IDPELQEALEGEELQEO I NNLKGR LWDFRSTFEDSQTT 
AO FAD EH FQAVGV T rDLINEDLNTIAEHTQQDARKEDKEEGSVTRK 1 1 DWVSSGEEVLNR 
ALLYFSDRDGNRE3LANFLKVQYAVQRATQRAELFAJ I VGT3VSSVKT IMTTQLG 

CPn_0709 796203 795742 

CT667 hypothetical protein 

QV I ANCK I ESTRALAOS VLDWHDT LVAK JAG PLG 

CPn_0710 796482 796210 

CT666 hypothetical protein 

RSRGEKSMATNKSCTAFDFNKMLDGVCTYVKGVC^YLTELCT 
Q I LSQYMESVSNI LTAVNTEMITMARAVKGS 

CPn_0711 796791 796486 

CT665 hypothetical protein 

TT I NNQVLG F INYL YLGRYSMFNMENTAKEEKNSQ P LLDLEQ DMQDHDRAQELKASVQDK 
VHKIJiAIiREGSDKESFGO^SLLAGWALQKVLXIRrNRJ<MI 

CPn_0712 799315 796781 

(FHA domain; homology to adenylate cyclase) 

MAVRL. IVDEG PLSC/I FVL EDG ISWS IGRDSS AND I PIEDPKLGASQAI INKTDGSYYIT 

NLDDT I P I WNGVA I OETTQLKNEDT I LLG SNQYS FLSDEFDPQDL VYDFD I PEENFSND 

SGDLSDSNEC^KDLEPRC/TSETNHSPKPKEKLTKI)CX;SSDPITSGDQELADAFLASAKAE 

KNQPRAKVAKKGLKESSNESLNPKEQNAKDSPKGEERTNKPQNAIMEDNGASPRQDPQPK 

SAEPSLKNTARDETPLKENKPVEEKANKKATPDSPEKKDQPEEGSKKEGSKIEATPLDSQ 

KESEDKEAEEAFVQEEEENLTEDNKEDSDSAADANDDTASDHTAEDNKETPKKVENEKSA 

VLSPFHVQDLFRFDQT IFPAEIDDIAKKNI SVDLTOPSRFLLKVLAGANIGAEFHLDSGK 

TYILGTDPTTCDIVFNDLSVSHQHAKI TVGNDGG IL I EDLDSKNGVIVEGRKIDKTSTLS 

SNOWALGTTLFLLIDHHAPADT r VASLS PDDYS LFGRQODAEALERQEAQEEEEKQKRA 

T L PAG S F I LTLFVGGLAI L FG I GT AS LFHTKEWPL EN I DYQ EDLAQV I ffQ F PTVRYTFN 

KTNSQLFLIGHVKNSTDKSEI^YKVDALSFVKSVDDWXDDE^VV^EM^ 

I SMHS PEPGKF I ITGYVKTEEQAACLVDYLNI H FNYLSI^E^^CWVETQMLKAI AGHLLQ 

GG FANI HVAFVNGEVI LTGYVNNDDAEKFRAWQELSG I PGVRLVKNFAVLX. PAEEG IID 

LNUm^YRVTGYSRYGEISINVWTCRILTRGDVirc 

IDYNK 

CPn_0713 799817 799332 

CT663 hypothetical protein 

I^LKEEKAGFRNEIVSIPGXn'KTTrAALENTSMLEKLIKNFATY>GI 

LP I SEWKVRAQQNADNE IVLS ASIjGALP PSADTAKLYLQMM IGNLFGRETGGSALGLDS 

EGNWMVRRFSGDTTYDDFVRHVESFMNFSETWLSDLGL^ 

CPn_0714 801125 800091 

hemA-Glutamyl tRNA Reductase 

NY R I VI^IVLGWG I S YREAALKERERA I QTLQ S F EKNLF LAQ RF LG KGG AF I PLLTCHRA 
ELYYTSESPEIAQAALLSELTSQGIRPYRHRGLSCFTHLFQVTSGIDSLIFGETEIQGQV 
KRAYLKGSKERELPFDLHFLFQKALKEGKEYRSRIGFPDHQVT I ESWQEILLSYDKS IY 
TNFLFVGYSDINRKVAAYLYQHGYHRITFCSRQQVTAPYRTLSRETLSFRQPYDVIFFGS 
SESASQFSDLSCESLASIPKRIVFDFNVPRTFLWKETPTGFVYLCIDFISECVQKRLCCT 
KEGVNKAKLLLTCAAKKQWEIYEKKSSH ITQRQ ISS PRI PSVLS Y 

CPn_0715 801636 803462 

gyrB-DNA Gyrase Subunit B 

KFNKISHMAAYTEAS ILSLASLDH IRLRAGMY IGRLGNGSQKEDG I YTLFKEWDNGIDE 
F IMGHG KSLK I SAS DKQ I S I QDQGRG I PLGKL I DCVSK INTGAKYTQDVFHFSVGLNGVG 
LKAVNALSE I FSVRSVRKKKYHLATFHRG VLQ ES KQGST KDPDGTFVSFT PDPS I F PEFT 
FNHDFLKDKIRQYTYLHSGLEIRFNDEVFISHNGLKDLFDAEITEPPLYSPLFFQNEDLT 
FIFSHLEGOTERYFSFV^QETLDGGTHLTAFKEAIVKGVNEFFGKTFVSNDIREGIVGC 
IAIKIASPIFESO/TKNKLGNTQIRSSLIKDVKEAIVQALRKDKVAPEIJjLEKIKFNEKTR 
KN IQF I KQDLKSKQKKVHYKI PKLRDCKFHYNDRSLYGEASS I FLTEGESASASILASRN 
PLTQAVFSLRGKPMWFSLEETKMYK^ELFYLATALGITQNEIQHLRYNKVILATDADV 
DGMHI RNLL ITFFLKTI^PLVTWNHLFILETPLFKVRNKTTTLYYYSEQEKMQAJ^QQFGK 
KDSSLEITRFKGLGEISPKEFAAFIGPEIRLTPVTITSLESISSILQFYMGKNTKERKQF 
IMDNLITDF 

CPn_0716 803466 804902 

gyrA-DNA Gyrase Subunit A 

FMRDVS EL FRTH FMHY AS YV I L ERAI PH I LDG LKPVQRRLLWTLFLMDDGKMHKVAN I AG 
RTMALHPHGDAPIVEALWLANKGYLIDTQGNFGNPLTGDPHAAARY I EARLS PLARETL 
FNTDLI AFHDSYDGREKEPDr LPAKLPVLLLHGVDG IAVGMTTKIFPHNFAELLKAQI AI 
LNDKKFTVFPDFPSGGLMDPSEYODGLGSITLRASIDI INDKTLWKOICPQSTTETLIR 
SIEMAAKRGTIKIDTIQDFSTDVPHIEIKLPKGSRAKEMLPLLFEHTECQVILYSKPTVI 
YENKPVECSISEILKLHTTALQGYLEKELLLLQEQLTLDHYHKTLEYIFIKHKLYDSVRE 
VLA I NK K I S ADDLH0 AVLH AL E PWLH ELAT PVT KQDTSO LAS LT I K K I LC FNE EACTKE L 
LAIEKKQAAIQKDLGRIKE\.TVKYLKGLLERHGHLGEPKTQITNFKTAKTSILKQQTLI 

CPn_0717 804^68 80530b 

CT656 hypothetical protein 

I R I KF I DT IT I WRME P R H I Y T RK P ET PKA P DV EK PC V P EYMTMANT PT F EG PVKT LDQ L 
RRAL I EQRGAEECOKMYDNF IQS I LI STFGLVHKDMDRAQKASKRMRSVYKEO 

CPn_0718 805300 805626 

CT657 hypothetical protein 

RA'/Il^rTYFLALPVDRLMQERFLCSPKRWAPFINSPLYLTLIADHDTrYLAKNLDKFPLP 
VEC/WEKTVLir/.^SLLKSrFLv'SDLSSLRLLALTKFEILTLNDLYfAONI 

sfhB (pseudouridine synthase) 

r d r y lvkk v t k r ; :mktvts ftvck CN.^f ;r luk y lt evmpky r z r AKYy n 1 1 vr/: LVQ I NGQ 

I N'P'VA I'RLHrflP 1 \T TD rOFKEF.LLI-ILLPFJV r I'LUVrfELflM I r>V INKPRUf^WHPAPG 

iif-[r;rr.viiAi.r.Hr.uu:Rr,KFE.Fr'i-:t:pwRW ;t vnRLDKr/r r /;Li ttaktro^kkvpitclfs 

TKr*:,KK , :YI.A7*'h'KI I' 1 ITT L 1 !TH r: IRMIJNKRKFWTV'/ liAVPl K 'OVLAFNt jKLSFV 

a;, pi-.ti ;rtiiolk\ hmkhlaVit f ! / ;i h % vy( ; [ p**Mrjr;';Y**;LrjKfjQ[ i i iay ivurnipRTROF 

< ".I KA*.[.!'!'|jN'i<"t.[,IK[ I'KNETT I L.NKN[«[X:; TLKFO 

CPn_0720 [coordinates illegible] 

CT[illegible] hypothetical protein 



LR [TMKEFLAV T rKNLVDRPECVR E KEVQCTHT [ I YELJVAKPD TGKI IGKECRTIKAIR 
T LL VfTVA: j R NTfVRVS LE I M EEK 

CPn_0721 807671 808489 

kdsA-KDO Synthetase 

KRMVMFNNKMIL I AGPCVIEGED ITLEIAGKL03 ILAPYSDR IQWFFK^S-i-DKANRSSLN 
:7FRGPCLTEGLR I LAKVKETFGVG I LTDVHTPQDAYAAAEVCNI LQVPAFLCRQTDLLVA 
TAETGATVNr,KKGOFI^PWnMFT;PTrir/ T ,^rffJKr:JTFPGr^F'T^irir 1 y^DMRSIPVL 

.■<<.•< ;mv ; f r /_!,[*,*:. ti . . "< * vAwV t'i tnt* IAK" 

LAA;JML.JLiiI^FAALLlTWE/.-Lr't'' V r M. / 

CPn_0722 808477 808974 

CT654 hypothetical protein 

YGLSMTKFLYCGLFYSLGLLVIAFGTMVA I IQVDQ ICDVSCMNKHFQES PPFLKIKKVNV 
SKQICSPEERFFHCKIDKSCMELHFPQSSYSCKEYLTRISGHILTQNFEKQMQFRGNSGL 
LNYQ DGSLHVYDCRFQVDPVPGYGS PDKEDSS SGGMKTLY LS LF RN 

CPn_0723 808978 809703 

yhbG-ABC Transporter ATPase 

ASMPILSVC^VT<KYNKKPVTNDVSFQ 

K 1 1 FKNVDVTKKTMDHRARLG IGYLAQEPTI FKELTVQDNL I CILEI IYKARKQQSHLLN 
T LVDDLQLG SCLHKKAGTL SGG ERRRLE I ACVLALN PSVLLLDEPFANVDPLVIQNVKYL 
IK ILAGRGIGILITDHNAKELLSIADRCYLI IDGKIFFEGSSSQMI SNPMVKQHYLGDSF 
SY 

CPn_0724 810602 809706 

No robust homolog present in Genebank/EMBL as of 11/7/98 

RTSTRLDYKSGCILSKILPFPELWKMLLGFLCDCPCASWQCAAVANCYDSVFMSRPEHKP 

NIPYITKATRRGLRMKTLAYIASUOARQLAYDFU<DPGSLARUUCALIA^ 

FFYGCSNIEDILEEMRRPHRILLLGFSYCQKPKACPEGRFNDACRYDPSHPTCASCSIGT 

MMRLNARRYTTV 1 1 PTF ID I AKHLHTLKKRYPGYQ I LFAVTACELSLKMFGD YASVMNLK 

GVGIRLTGRICOTFKAJKLAERGVKPGTVTIL^ 

CPn_0725 810829 810587 

CT652.1 hypothetical protein 

SC G DVGMF FAPL LYES LRRG LMH PT S HMQQQ LARL EF INDQL.TTELEHVNELLC SLGFP E 
GLTT I KAI AEEVLSDDEPLLD 

CPn_0726 813384 810880 

CT620 hypothetical protein 

AR, JDM I YST S I ST FYKKLSLVS SMH S FAQRHRESLEH IANYEKTTAERD I LKRL I EVLDQ 
RsSSERYRSAVEKLHKYEVERATVAKS I PVAAI HEKPLSSTHASVQVTASTPAATGSGVGA 
YYj^ATCQKWAQDLrVEUTIWITIMA 

F^LYNFPEEIFTAIQRAOTFTGG>!KTDFTNQ1AGKYGNQATLTQTFADGRVEGFKDILT 
A\5S£VLTPEQFT I FAEIATELQAIADHVGNFDEAGLQRIEDAGEKLAAVINSSDLTRNDK 
It&CQH ITDLYSDQVAALGSFDTVLDAS IYVNQHQGTMFSNLSSFVGSL IGTFAPIDLSS 
SQ&D I S S AALAGALQTARG LNS RFNELTAEQQKL INEC I KS L VT FKCG E KLG A IWA YFT A 
STWALNPTATMDHVKAA I LEEAKELDNSSFQLAS S I KSAMTS IVNSSG SFSVTVNSSTL 
QffiYSEKNGKVEINQIU^GSTCFLPEITKLAKTNA^ 

Nk^DLQSQLQQFTNMKTELFDGQIXSQASE^RALPLPSAVASVLIDRYMPKE^ 
YK-JSLYYSNLGSS I GNS 1 1 DA I SQYVNGATYFNFASYVGQQPAVGAGGANAF PGSQESAQA 
KttQQERKQAALYLQET RGALTVIEEQRARVLKDDKI TNEQRST ILDSLRNYEDN INS I SG 
S|$LLQNYLC:PL S IAGGSVAGT FEVK EGQEQWQARLQ I LEEALVSGLVGNMINGGMF PLC; 
ST'l^SDQQSFADMGQNFQLDLQMHLTSMQQEWTVVATSLQLLNQMYIiSLARSLTG 



CPn_0727 813559 816192 

CT619 hypothetical protein

KYYLFSMSTFSIQNRLRTISGESTRI IKLDHKYSGFDPRSVPAINLEELNSG IYALRHLM 
NALQ S ENTNVAALLNPNNT I F PTTSWTDYKHS RPQAS S PRAPSSQTPTD IVSAAALALVL 
VIDGGLAELVASVTEIDLjGALSTISTVRQI^ASYI^ 

LEH^QEKAAEIQAKQEEIKAVLEAKGVSTEEIEAIIJCEYPDIYAADFFKEFIEEPLHTY 
RAKVGAPIQEMNENAIQLLPTPPAITPDNVNEVNGMNTLST I LQAI DDAIKQAPALGGDQ 
EtlTILQTLVPLVDKTTFTKAEFDLIYTATQ^Pbn'ASLKLYLTDROIAEYRGKITKVYQN 
S te&ILSETK RWENNRSML ETQ LSMFQQAQNC FVTWI SQ ANALNI A I TNKY I SAVLTTSM 
E3^GGLLCLSYMYERLADDEKAIFDKSVNEYLPIHIWGGSW^K>7IAKMAAYQEX^^ 
USf^VTSQDQ I KAYLQTRGNEFKATRHFFHNIGDQMYQFANETVFGNCLTTANGAIQPDL 
GG REAMTNVGTVEADYVSNAQR I LNEFNTAAT AHVLQ LQ LQ IAELQKKADDLDPGKAS 
FTH^IRKFAVAAWITSESI^DALISMILNSOLPKQEAFLKPLIEEINFNNLAANALNS 
ITNEFSTTSVYYSLSSYLVQSKTGQNLFAGDYYETLLAAAREREYIYRDTARCKQAINLV 
NGLLQKINSL PGAT S AQ KQ EM LNATT YYQ Y S L SVT LNQ LTVL E S LLAGL KMT LCTT SNNK 
YDKS^/FKIESFDWIPTUVO^ESFLTSGFPNISATGGI^PLFTQVQSIXXTI^TSCGCTrQQ 
LNLQNQMTT IQQEWTLVSTSMQVLNGILSQLAGAIYSN 

CPn_0728 818483 816525 

CHLPN 76kDa Homolog (CT622) 

VFMVNPIGPGPIDETERTPPADLSAQGLEASAANKSAEAQRI AGAEAKPKESKTDSVERW 
S I LR S AVNA LM SLADKLG I A3 S NS S S ST S R S A DVDSTT AT A PT P P P PT F DDYKTQAQT A Y 
DT IFTSTSLAD IQAALVSLQDAVTNI KDT AAT DE ET A I AAEWETKNADAVKVGAQ I TELA 
K YAS DNQA X LDS LGKLTS FDLLQAAL LQ S VANNNKAAEL LK EMQDN PW PGKT PA I AQS L 
VDQTDATATQ I EKDGN A I R DAY FAGQNASGAV ENAK 3 NN S I S N I DS AKAA I ATAKTQ I A E 
AQ KK F PDS P I LQEAEQ MV I QAE KDLKN I K P ADGS DV PN PGTTVGG S KQQGS S IGS IRVS M 
LLDDAENETASILWSGFRQMIHMFOTENPDSQAA(^ELAAQARAAKAAGDDSAAAALADA 
QKALEAALG KAGQQQG ILNALGQ I ASAAWSAGVP PAAASS IGSSVKQLYKTSKSTGSDY 
KTQISAGYDAYKSTNDAYGRARNDATRDVINNVSTPALTRSVPRARTEARGPEKTDQALA 
RV T SGNSRT LG DVYSQVSALQSVMQ I IQSN PQANNE E I RQKLTS AVTK PPQFGYPYVQLS 
NDSTQKFTAKLESLFAEGSRTAAEIKALSFETNSLFIQQVLVNIGSLYSGYLO 

CPn_0729 819905 818592

CHLPN 76kDa Homolog (CT623)

pawssvjtln i dtkdtmkkovyowla.swllalt ecgyafxpl:;eokvkshtyttldevk 

DY L^> K RG FV ET R KQDG V LR E \G DVRARWL Y F R ED E K t J DK D K Y N P L PV NR Y RSEFY L Y I 
[)YRAERNWL.T:;KMNV/TA [A^XIENTAAGVD tNRArLGYRFYKNPLTRTDFFME EGRSGLGD 
L.I-'EJJEVUFiJ^NFCX'.LH E YWTREL::KD\ PY^VrVIKTJPFVWMTKKHY^WWEc; ilnrlpk 
tjl- E-*VKC:;VVDWNTbVPf;E*r: TTHKAATNAMKYK Yt .-/wrjWLvr ;kh : vvf w t mgokkplyly 
f ;a f r Mt j i ' la k A' r kt r lng k l aw r [ c x :t l - y ; lr y am jw : a' r v r v v y v l:a l r : v pe e ovr/ ; 

UiWMl . LK K W K A(JA [ AAN Y D P K E ANt 5 FT NI Y K f ! F' ' AL /MY< : ITh' -L. -FRAYGAY-'JKPANDK 

ijj:;DFTi'!<KF[H;jr [';ai' 

<:vujiVJ u) h z it, it, s yi'M, i 

niviN frtl.iMi.ii Mt-mtit.uie Ptor.-Ln 

::CKK( I'ltll'-.NCI.M^RKlJNr.VSLAHSrrNLLJG'ITi Yx* fT f WYW. f AM YrYI'GAlXTVAArW 



l/jFrtvfflpk r lggl i LcoAr r phfeflracsldra^\fffrrf ;rl : k-;l*t i iftlli e 

AVLWWLQYVEEGTYDM I LLTM I LLPTG I FLMMYNVNGALLHCENK FFGVGLAPVWNI I 
WIFFVEAARH3DPREP r [GL^VALVIGFFFEWLITVPGVWKrLLEAKCPPQEHDSVRALL 
APL5LG I LTS3 1 FQLTJLLGD ICLARYVHE IGPLYLMY5LK I YQLP I HLFGFGVFTVLLPA 
tSRCVQREDHERGLKLMKFVLTLTMSVMI IMTAGLLLLALPGVRVLYEHGLFPQSAVYAI 
VRVLRGYGA3 1 E PMALAPLVGVLFYAQRQ YAVPLF I G IGTALAN I VLSLVLGRWVLKDVS 
GIGYATo ITAWVQLYFLWt'YGSKRLPMYSKLLWEG I RRS [KVMGTTMLACMITLGLNILT 
OTTYVrrLNPLTPE AWPLnil TTAOATAFr.nESCIFLAFLFGFAKLLRVEDLTNLASFEYW 



CPn_073I ^Jl/bO 

No robust homolog present in Genebank/EMBL as of 11/7/98
VAIAISRNIPVrRLQr/PDNILKIERAKETSLSFLLIKPFSPPPLKQDYLFDISPYTSSE 
TT IGGS YFKLNKAS LQSSTLRLRS I S 1 1 S 

CPn_0732 822092 822976 

nfo-Endonuclease IV

NFMKVLPPPSIPLLGAHTSTAGGIJQ^AIYEGRDIGASTVQIFTA^RQWQRRALKEEVIE 
DFKAALKETDLSY IMSHAGYLINPGAPDPVILEKSR IG I YQE ILDC ITLG ISFVNFHPGA 
ALKS SK EDCMNK I VSS FSQSAPLFDS S P P L WLLETTAGQGTL IGS NFEELGYLVQNLKN 
QIPIGVCVDTCHIFAAGYDITSPC<jWEDVIiNEFDEY7GLSYLRAFHI^^ 
HAPI^EGYIGKESFKFUTTDERTRKIPKYLETPG^PENWQKEIGELLKFSKNRDS 

CPn_0733 823739 823101 

rs4-S4 Ribosomal Protein 

GLKYMARYCGPKNRVARRFGANIFGRSRNPLLKKPHPPGQHGMQRKKKSDYGLQLEEKQK 
LKAC YGM IMEKQLVKAFKEVI HKQGNVAQMFLERF EC RLDNKVYRMG FAKT I F AAQQLVA 
HGH I LVNGRRVDRRSFFLRPGMQI SLKEKSKRLOSVKDALESKDES SLPSYI SLDKTGFK 
GELLVSPEQDQIEAQLPLPINISWCEFLSHRT 

CPn_0734 823863 824915 

yceA 

QNTKEHFSSNGNFLQCNYFQDYVRVF IMEKKYYALAYYYITRVDNPHEE IALHKKFLEDL 
DVSCRIYISEQGINGQFSGYEPHAELYMQWLKERPNFSK IKFKIHH IKENIFPRITVKYR 
KELAALGCEVDLSKQAKHISPQEWHEKLQENRCLILDVRN^ 

REFPEYAEKIAQECDPETT PVMMYCTGG I RCELYS PVLiLEKG FKEVYQLDGGVI AYGQQV 
GTGKWLGKLFVFDDRLAI PIDESDPDVAP I AECCHCQTPSDAYYNCANTDCNALFLCCDE 
CIHQHQGCCGEECSQSPRVRKFDSSRGNKPFRRAHLCEISENSESASCCLI 

CPn_0735 825680 825003 

Uridine Kinase (Uridine Monophosphokinase) (Pyrimidine
Ribonucleoside Kinase).

GEKFMLMMLWMI IG ITGGSGAGKTTLTQNIKE I FGEDVSVICQDNYYKDRSHYTPEERAN 
LIWDHPDAFDNDLLISDIKRLKN^IVQAPVFDFVI^NRSKTEIETIYPSKVILVEG 
FENQELRDLMDIR I FViyTDADERI LRRMVRDVOEQGDSVDC IMSRY LSMVKPMHEKF I EP 
TRKYAD 1 1 VHGNYRQNWTN I LSQKI KNHLENALES DETYYMVNS K 

CPn_0736 827731 825992 

ygeD-Efflux Protein

RG E LJ1JCLARCC LVAFMTVSVKK KS FRALVTTH FLT 1 1 NDNLY K FLLAFF LL EGKTLTENA 
KILSCVSFFFALPFIXLAPLAGSLADRFOKRNI ILATRF I EI LCTILGTYFFF IQSWGG 
YVVLIIJ^CHTTIFGPAKLGILPEMLPSEQLSQANGIMTAATYTGSIUSSCLAPL^ 
HRLGVNSYVWPTLMCVIVS I ISTLISFCIRPSNVKNVKQKITLVSFKDLWKVLKDTRMIH 
YLWSIFLGSFFLLIGAYTQLEIIPFVEFTLKYPKHYGAYLFPIVALGVGTGSYI 
GKDIKIGYVPLAAIGIiALVTMGLYAFACSILFVLFFLLAIjGFLGGVYQVPLiiAYV^ 
EHKRGQ IIAANNFLDFFGVLVAAGVI RVLGSNLGLS PETS F FY I GWFVLAVS I WTLWIWR 
EHVYRU^IILFJ^QI^YYI^IHQSSSPKCYFVAV^ 

Q KLV PGWRAWIX, SWCVPTVVS S VRDN D S EAQ DAWA VLQ ANH LKT S LKKF P DVS WC LGL P 
KNVERFTS ILQEQGIDLHP IQLVQKEGKKRVI YTLVFPHA 

CPn_0737 827469 830756

"recC-Exodeoxyribonuclease V, Gamma"
KRSAKLPASGASKFJ<GRAKKKLTQERIFAFSVRVLPS^ 

SALNDFFLTETVMNATKHCRASFSNS PRHLI^QLAEDITSTHOKPFTKRWILVANATTGH 
WIKNQLVHVLSDHIFMGSTIFTASDS IVKHLFLGSGCSQPNI PDYLTLPLLINNILEEIS 
KASKFENGREFLSPPTYETTKKLAAAFKQFHTFSQRPTKNASHYQELFQILESHFSSYEE 
MFTTILNNRTOEEDCSLH I FGYAHL PKHLAEFF INLSTYFPVYFYCFSPCREYFGDLLSD 
RAIDFFT^QLPDSPIKNAWEHYV^SDROALLANLAHKSOSSQNFFLDREIDYQEMFLPSK 
HDS SLGVIQNS I LDLKPTS PODFSQTKQTICI YRALNI PREVQEVFCKVTELLHRGVSPE 
EI F ILSSHI ESYKVHLNAI FNPHVPI YFTDEVDPRAEDLRNKILLLSS ILQTQGDLHYIL 
QLLTHPQEXX3PI DONKVPYLIKKLSSEWGKISSKDRASGQOMKALGDLI LEEYPFHQEGG 
RVSQVEVWETTV PL I Y F I Q ER INLYL S S SQHS Y EDLFQNVFSCLEK I FVLS PEETS F I TT 
LRNSLFPTFATSSCSLLFFTDFCLDFLLHFHKPSPLYDKPGPYIGSLSSLSLIPKGYVFI 
LGANKTTSSD I FDLLNRTTTHEELAFSSTEDEENFH FLQ ILVSTKHELH I SY I SSAAQFN 
LPSPFLNHIKETLDLPVETLPTQPYLSAFFKNKACLHTSQEYNYSLAHAFYSKKALLPSL 
F I PTVKQVNLPQHLSLNEI IKGIFSPLDLFLKTNYNLRISYPEHLKKQQKLFPTKHQIED 
FWNECFVDKEHDLIPS ESPHAEELFTYYREKTILLRNGLDKDPKHSPYTVTFSSS IFEER 
PYHECYLFPPLSLSFCGNPVQIHGTIHGVCNEGLYLCSIDPRDSLKKTTRTLGSLPETSS 
EOKOLLERYVALAVLOMSQHLSSDSALIKLTSFNTKENHHPPFSDPEGYLRKVLEVYHLM 
SSQP r PLLS P LC WKT LDDE £K FHQA VL3 A I SEEAKNPSLPI FWQFHNRN I EEILNHVGAS 
ERLKILSLFRGPCEAV 

CPn_0738 830719 833895

"recB-Exodeoxyribonuclease V, Beta"

KFYLFS EVPVKPFN r FDSNSS IQGKFFLEASAGTGKTFT I EQ I VLRALI EGSLTHVEHAL 
A I T FT NASTN ELKVR IKDN LAQTLP ELKAVLNCO PAS L PTY LDINC MVKQ I YMQVRNALA 
TLDQMS LFT IHGFCNFVLEOYFPKTPL IHKNPALTHSQLVLHH ITNYLKQDLWKNVLFQE 
QFHLLAVRYN ITlJKHTiJCLVDKLLAGYTOP E^GYFSSRVERLEQ ISLWHOQ IYNSLLEI P 
KQVFLD0LTAHISGFKKQPrJILDDLHHFVDLLrrSETH33LFSFFKIAETFNFKHRLAR 
YKPf v AArrVLENMf«^*CRTLFJ"CNLDR IFNTLLVDLOEYLKQNYTPWLSPDESVFALEKL 
LJnnEAQP\A,'OALRFOYOLVL T DrFODTDKOO'Wj T FfJNLFIS PKFTGSLFL IGDPKOSIY 
b>fR:;ADLt v rYLTAK:^-r';i:DKOLOLVrJWRf;Tr-KLMEAinQEFGKI.*:PFLCrPCYLPIEY 
HALNPQ^lJETFENrrH \P I HrFFYCT EKLXJALW I FTEALPLQKEQK [ r^NMWLVSDSN 
(JAFEL [ ''YAT I PV:'P:*KNKS [ FHLTFTi f I LTT/.LLEAX LHPEN'^'EK [L'K I LF^.'ILFGLnL 
!)KVTTKKEPr*T E Yf 0''LH:!Y t : *HH f JLLATFYE^'/MTTCXJir/LF:]; JPPCIPL I FOEMEKLCGY 

[,ivr[:;:;YPYHOLLMLKNF:;rri -wweli 11 .a e:j;> /.'jiiDLETLK ett if c iSKt h.cydevfcpg 

[EK r jKKNKS:>^EL£.REMYVA('TPAKKOLYLI' [ "TQPPrjLOPijlIALTNYVKLEGTQSSAYD 
LA [HLHyEUPDLFliYi'LI'KDHClHA'n'VLNLl'LLLTFALK'/TPPKT E FSFSSTKFLLE7TI1K 
D::0 [ ;f[ J Y:;KLlU:;KCVLP[,!Ur:KT^[f.rHK [Lr';rOI^';LI J Uf7l , EYtW. t ; , riMRFEKHTHLEG 

FErrr ELKLL,SKTFF''rE:rF:::'oT c, ';r/;fjvi,r'riK[FRnT';rLFLCNfjK[ J wcv;vEDLFFEHE
< ; k y y r t nw k t : f u ; i-- i*n i ; d y s k 1 1 r . : ; i y i wj \ y t xjY or ; t * r ka v f ' k r i .uq f ■"■ n e e^ddve e ,
OV r F U J ( l Lt'njCNGFFALNS^ED I PNKNPKA I'JKCOAYH

CPn_0739 834892 833361

<7T ih« hypothetical protein 

CKVLFKLMGY3LRNKKTKrcV , i p riIALGILSFR3IPQEVYDKIR3SFVSLHVKFFPKIKQ 
APGGHLANLELENLVLKERVAGLEEKLKLYEVSNHTPPLFPEILTPYFHKLVEGKVVYRD 
YTHWSSSCWVNVGKTHG I KKN3 PVLSCWLVGLVDYVGEHQSRIRLITDVCMKPSVVAMR 
OOrOSWWTKH.nLRELrPOVEOCSHAYTLEKDKYEKISCLQELDSLrQGEGENOALLRGTL 

:;■/■:. ;Ar.wi'i. r./M' " •? - ;;<TLr <('.. ™r.* r Li«'vrpr'G:,' /apvtv/f .-.= po 
■ ;ai ."!•!•■[•' :\-:r,.:\.t lfi.mlt.l l.- :LFT:.r-.r f i t n-,--? ["gllwl 

CPn_0740 836054 834864

tyrB-Aromatic AA Aminotransferase

SYMSFF^IPTFSPDAII>3I^NWFADKRPEKVNLVIGVYEHP0KRYGGL^CIRKA0TVI 
LEEEQNKSTLPISGI^IFL^EMRELVFGAVDPSAIVGFQSLGGTGALHLGARIjLSVAKGS 
GKVYVPEQTWSNHIRI FSQSGLEVI RYPYYSKEQKQLLFEPL IAFLKEVEKNSVILLHGC 
CHNPTGVDFTEDMWKELA I LMKEREL I PF FDTAYQG FAHG I ELDRKP I E I FI SEGNTVLV 
AASSSKNFALYGERVGYFAVHSTFTDELVKIHSFLEEKI RGEYSSPQRWGVE IVSTILSN 
PYLKEEWQSELNFIRESLGK^TRFVQAIJ^KVAGHTFDFLLSQHGFFAYPGFSDKQVLFL 
REQHAVYTTAGGRMNLNGITEKNI DHWQSF IQAYEL 

CPn_0741 838383 836185

greA-Transcription Elongation Factor 

EY I FRLKTGD I VDYLEKLQVLI EEGQSANFLSLWEEYCFNDVVRGRELVEILEKVKSSSL 
ASLFGK IVDTWPLWEKIPEGKDKDRVI^LILDLCTSNSQMFFDIATEYVNKKYSGEENF 
NEALRWGLRDG RDFQFSLSRFDFLMHMH KGNFVFKQGGWGVGEVMGVS FLQQKVL I EFE 
GIMSAKDISFETAFKSLTPLSGDHFLSRRFGDPrcFEAFAKENPIEWEILLRDLGPKTA 
KEI KDELVDLVI PEADWNRWWQSAKTKIKKGTRI I S PDNPKEPYVLSDAGCSHMGQLERK 
LGLSLNSAEKISLIYHFIRDLHSELKNIEIRKSLVKAI^DI^VEEGNKSLILQRELLLSE 
YLGIKDAS IDKEYITSLSEDDTSRLLENMP IVALQKSFLSLVRKYSSFWQQVFMQrLLYT 
TS PTMRDFVYKT I KNDPSSVEVLKKRLIJI>SAHQPMMFPELFVWFFLKLGNHEDGLFDPED 
KEVL RL FL E SALNFMYQVAST PH KELG KKLH HYL VGQ RYLAVRQM I EG AS LP FLKELLLL 
STKCPQFSSSDLNVLOSLAEWQPTLKKHKSNVEEENVL 

MVDNAKE I EDARS LGDLRENSEYKFALEKRARLQEE IRVLSEE INRAR I LTKDLVFTDKV 

GVGCKVTLKGDAGEVVEYTILGPWDADPDSCILSLQSKIA 

ISRIQSIWEEHGA 

CPn_0742 838442 838688

CT635 hypothetical protein 

TKMMVIVMNSKSAQKIIDSIKQILTIYNIDFDPSFGSSLSSDSDADYEYLITKTQEKIQE 
LDKRAQEILTG^GMSKEQMEWAT^PDNFSPEEWLALEKVRSSCDEYRKETENLINEITL 
DI|ffiBTKESKRPKQKLSSTKKNKKKNWIPL 

CPn_0743 838956 840362

"nipiA- Ubiquinone Oxidoreductase, Alpha" 
IE^ITVNRGIJDLSLC^SPKESGFYNKIDPEFVSIDL^ 
lA^rKHFPNTYITSHVSGVVTAIRRGNKilSLI^^ 

e efrenglfal i kqrpfd i pai ptct prdvf inladnr pftp s pekhlalfs sreegfyv 

fw<5vraiaklfglrphivfrdrltlptqelktiahlhtvsgpfpsgspsihihsvapit 

nekea/vft lsfqeivltighlflkgrilheg 

lt|disdndtlisgdpltgri£k}<£ee 

tk^tylsgffkkkrtytnpdtnlhgetrpiiot 

ne|||flevcgedfalptli dpsktemlt ivkesli eyakesgiltphqd 

CPn_0744 841387 840389

hemB- Porphobilinogen Synthase 

EMS S LT LSRR PRRNRKTAA IRDLLAET HLS PKDL I APFFVKYGNNI KE E I PS LPGVFRWS 
LDLLLKEI ERLCTYGLRAVMLFP 1 1 PDDLKDAYGSYSSNtPKN ILCHS I HEIKNAFPHLCL 
ISlCXSALDPYTTHGHDG IFLNGEVLNDESVRI FGNI ATLHAEMGADI VAPSDMMDGRIGY I 
RSI^3SGYSKTSTMSYSVKyASCLYSPFRDALSSHVTSGDKKQYQMNPKN\^EALLESS 
LDEEEGADILWVKPAGLYLDVIYRIRQOTCLPLJ^YQVSGEYAMILSAFC^GWLDKETLF 
HESCJAIKRAGADMIISYSAPFILELLHQGFEF 

CPn_0745 841903 841742

No robust homolog present in Genebank/EMBL as of 11/7/98
VDSJ|-DDWRAS3LCGSTTYWAYDPKHTLAYGFCNQVSVKKFHLKPPKSQEKFL 

CPn_0746 841939 843567

CT632 hypothetical protein

FSGRC P FS FEVFMLGKEEEFTCKQKQC LS H FVTNLT S DVFALKNLPEVVKGALFS KY SR S 
VLGLRALLLKEFLSNEEDGDVCDEAYDFETDVQKAADFYQRVLDNFGDDSVGEIJGGAHLA 
MEWS I LAAKVLEDAR IGGS PLEKSTRYVYFDQKVRGEYLYYRDP I LMTS AFKDMFLGTC 
DF L FDT YS AL I PQVRAYF EKLY PKDS KTPAS A YATS LRAKVLDC I RGLL PAAT LTNLG FF 
GNGRFWQNLIHKWHNI^ELRRI^DESLTEI^KVIPSFVSRAEPHHHHHQAMMQYRRAL 
KEQLKCLAEQATFSEEMSSSPSVQLWGDPCGIYKVAAGFLFPYSNRSLTDLIDYCKKMP 
HEDLVQILESSVSARENRRHKSPRGLECVEFGFDILADFGAYRDLQRHRTLTOERQLLST 
HHGYNFPVELLDTPMEKSYREAMERANETYNEIVQEFPEEAOYMVPMAYNIRWFFHVNAR 

ALQWICELRSQPCGHOm'RTIATGLVREVVKFNPM-/ELFFKFVDYSDIDLGRLNQEMRKE 
PTT 

CPn_0747 843949 844053

CT631 hypothetical protein 

RTCMGC KG A EVQ I LSSRSLSGMK I LSSSLFYKKFC 

CPn_0748 8449*6 844121 

ispA-Geranyl Transtransferase

OTLVLHALDTYRPSIESAIEKALEGFGPrGHPrRSPVEYALQGGGKRLRPGLVCMMAQGL 
GLNHDVMDSALAVEFVHTiiTLIADDLPCMDNDDERPGRPTVHKAFDEATALLAGYALIPA 
AYSHLRLNAKKLKEQGCDPREIDIAYNIIGDITDKriIGCSGVLCGOYDDMFFGNRGQEHV 
OSTMIKKTGSLFEIACC^GWLrGGGDPOFAPr rTCF^NNFGLLFQ [ KDDFSDLQKDSQQI 
<M.NYALLFGEKAALELIARC0^rLELLDRLSAGGLK^GSEFETrtC^LG i 3F 

CPr_U74'i K4'^3H rtl^OOn 

(I lint I UDl'-tUcNAc lYir>phi)*:phoryLai, t » 

VGYMTYE.A: t F'.', I 'f'DFE.Yt 'E I li'KAHYTWD I LPLML/jMLHNnVK^r : [ I iGTVEGGVTLKN 
[EKIETAEPAYV E.';< JAY TVf ~,rc rUj::uTLVRMCAf LPGNVH^SRrWGHCTEIKNSYLG 
I fflTKAAl IFAY1 ID^VL;': U';VNLGAGVRt\'\NFRI.DTIPN lYVPi'TSDK.^KK [DTGRRKLGAF 

\t ;k< ;va [fkwviNhviirt.piiTR tpr* ;ove 

' "l-n_r)VM) H4t..| t ,S «4'W<>'
tctD/cpxR-HTH Transcriptional Regulatory Protein + Receiver
Domain

KITDFILRIHSYNLFCFHMICDK t tLFVTEDLSLoSQLKDLASORGDYQILVSPVFPTSF 
ESVAIFCEYLLLPEQIFSPGIFPEEDLIVLFDTFCEEAITKVLNQGATGYLLRPITAKVL 
DAVIRAFLRQHEVLEHS I PDTMTFGDHTFRVLNLV I ES P EGSVYLT PSEAG I LKKLLINR 
GHLCLRKNLLAE IKGNTKE 1 1 ARNVDVH I A3LRKKLGPYGSK I VTI RGVGYLFSDDDSIP 
LQNHDNTAHPNEE 

rn^ : ' ' V^giy-^S 

MFRCILFGIFLLTCFb'SGGVLYYLFCSHDF3IGPKEKSRSVWIEEEKEFTDSVLHHLPS<2 
HQHLHILCFQGFLLQKQQKFSQAEKIFSKVYDEAQDGPFLFKEEILGSRLINSFFLEKTD 
VMETILCLLNQRC PNS PYYHLFKALVCYKQKL YREV I EQ LAYWQEEK7RALA PLLN I S I E 
QLLTDFLLDY ISAHSL I EQKMF PEGRVILNRN INRLLKHECEWNAKTYDRI AILLSRSY F 
LELVESKSAD I Y FDYY EMVLFYLKK I Y I L EQC PYAELLP EE ELVS L I MEHVF ILPKDKLY 
PLIQLLEMWQKHYVHPNSSLWQ ILVDRFSTHMEGAIRFC EALVSFSGLEELHQQ I ITTF 
EELLSNKVQQ I KT EEAKQCVALLH I LDPS I S I S EKLALS 5 DTLON I VSG DDEQHTKLRNY 
LDLWEAIQSYDIDRG^LVHHLVYGAKDLWKKGGNDEKALNLLQLVLRFTSYDrECESV^ 
LF IKQAYKQALSSHAI ARLLKLEKF I SEANI PSIVX SEAEKANFLADAEYLFAHEDYDKC 
YL YSMWLTKVAP S PQ S YRLAG LCLMENKRYD EALE F LCM L S PNDS INDYKTQKA1AFCQK 
HQSKDRAAS 

CPn_0752 848595 850082

"recD-Exodeoxyribonuclease V, Alpha"

GWALHTEFAPFLEDLVHQQVISPLDIAFASKHlSSDFEESFVFLAVSSALWRYGHPFLSL 
EENRIRPSLGG I SETDLYRGFHNLPKEARDKLFVWSGRLYLRSLYTI RSKLLDKLSLLC 
SATPWFPPSIDSSILSEEONFIFmiTC^FSrVSGGPGTGKTFLAAQLILSLVKQQPK 
LRIAIVSPTGKATSHIRQILMKYNrFDDMVt^GTVHHFLQEYAYRRYNSIDVI^ 
VTFDIiYSLVC/TI^GYEKDKKLYTSSLIIIXjDTNQLPPIGIGVGNPLQDLIGYFHENTFF 
LKTSHRAKTGWDQLTQSVLRGEMISFSPLPS I SSAI EVIJCNRFVKSl^QSFARLCVLTP 
MRHGPWGVLNI^miIKQRIJ\RSDPDLRrPIMVTSRYETWGLF^[^ 
HEPtDSRALSQYVYNYVMSVHKSQGSEYDEVIVI I PKGSEVFGVS I LYTAITRAKYRVSV 
WGDPETLHKI IKKSNY 

CPn_0753 851009 850161

No robust homolog present in Genebank/EMBL as of 11/7/98
IMATAHLGRQALLHLRSWTPAI RASGNLFRQQ SMSLHNNVL FAGD I VGA IKNSTA I SRHA 
I^SSHYAHAAUJKTEGFLGAADGVOTAVAGAMLWGQLLNGSMI FETDEETGELRRCNEAD 
AEGCMTQKLQRRSALT ITGKVARLASKTLGTATFLHEMDVVS LGANANK IGCKVTSCLNL 
VATGCSLTESSrSLYRILSTRPETISDPENRNKPSAEFAARSKAIRNAFIAWLGDWDLV 
CDALGT LS LFLPA I EjGVHAVL IMAI LGL I S CV I NFVK DY AK I G 

CPn_0754 851381 851040 

rs20-S20 Ribosomal Protein 

QFIL^ILKVLVLSGDI^^APKKP^nCK^IVIORRPSAEKRILTAOKRELINHSFKSKVKTIVKK 
FEASLKLDDTQ AT LSNLQ SVYSWDKAVKRG I F KDNKAAR IKS KAT LKVNARA S 

CPn_0755 851579 852799 

CT616 hypothetical protein 
YKDLFFMIXVRKWUrcCFKYWIYFLPVVTIIX^ 

F F ALRRRENQLKT AA VQ LLQTK I RKLT ENNEGLRQ I RE SLKE HQQ E SAQ LQ I QSQ KLKN S 
L FH LQG LLVKTKG EGQKLET LLLHRT E EN RC LKMQVDS L I Q ECG EKTE EVQTLNRELAET 
LAYQQALND EYQ AT FS EQ RtML DKRQ I Y I GKLEhfKVQ DLMYE I RNL LQ LESD I AEN I PSQ 
ESNAVTCNISLQLSSEIJ<KIAFKAENIEAASSLTASRYLHTDTSVHNYSLECRQLFDSLR 
EENLGMLFWARQSQRAVFANALFKTWTCYCAEDFLKFGSDIVISGGKQWMEDLHSSREE 
CSGRLVIKTKSRGHLPFRYCUlAIJaKGPLCYHVLGVLYPLHKEVLOS 

CPn_0756 852889 854676 

rpoD-RNA Polymerase Sigma-66 

ISYLPLTKI^SSKARNPLVLFQVRKLFMOTQNSQATEVSSEEESQKKLEELVALAKEOGFt 
TYEEINErLPMS FDTPEQI DQVLIFLTGMDIQVLNQ IDVEROKEKKKEAKELEGLARRTE 
GT PDDPVRMYLKEMGTVPLLTREEEVEI SKRIEKAQVQI ERI ILRFRYSAKEAI S I AHYL 
ISGKERFDKIISEKEVEDKTHFLKLLPKLITLLKEEDTYLENLLLSLKQPDLSKQEAAKL 
NDSLEKCRIRTQAYLRCFHCRHNVTEDFGEWFKAYDSFLHLEQQINDLKVRAERNKFAA 
AKLAAAKRKLYKREVAAGRTLEEFKKDVRMLQRWMDKSQEAKKEMVESNLRLVISIAKKY 
TNRGLSFLDLIQEGNMGI^IKAVEKFEYRRGYKFSTYATWWIRQAVTRAIADQARTrRIPV 
HMIETINKVLRGAKKLMMETGKEPTPEELAEELGLTPDRVREIYKIAQHPISLQAEVGEG 
SESSFGDFLEDTAVESPAEATGYSMLKDKMKEVLKTLTDRERFVLIHRFGLLDGKPKTLE 
EVGSAFNVTRERIRQIEAKALRKMRHPIRSKQLRAFLDLLEEEKTGTSKVKSLKSK 

CPn_0757 854709 855134 

folX-Dihydroneopterin Aldolase 

PCIKNIALVIArERYQLIISKFRMWLFLGCSVEERHFKQPVLISVTFSYNEVPSACLSDK 
LSDACCYLEVTS L I EE I ANTKPYAL I EHLANELFDSLVI SFGDKAS K I DLEVEKERPPVP 
NLLNPIKFTISKELCPSPVLSA 

CPn_0758 855104 856459

folP/dhpS-Dihydropteroate Synthase 

RAMS EPRFVCLS LG3NLGNRFKNLQ I ARTLLGEQAVLGLRSSVI LET EALLL PG S P PEWD 
LPYFNSVLVGETTL3LRELLVTIKQIEKWGRAEESPPWSPRTIDVDILLYGDESFCCDH 
TEITIPLSNLLSRPFLIALIASLCPYRRFCTQGSPYHNFTFGELAHHLPSPPGMIRRSLS 
POTMI^GVVWVTNDSMSDGGMFLDPEKAVAOAEKLFTEGAAVrDFGAOATNPKVKQFLSV 
DQEWERLEPVLRLLKETWSNRKQYPI ISLDTFYPEI ILRAMDIYPIQWIKDVSGGSQSMA 
EVARDCELSLVMNHSS3LPVDPKNILSFSVPIGE0LLSWGEKQLKMFSDVGLNANQVIFD 
PGIGFGKGAAQSLATLYEIAKFKRLGCPILIGHSRKSFL3LFGNMDPKDRDWETVGLSIL 
LQCCGVDYLRVHNVAAHQKALSVAACEACAPI 

CPn_0759 ?5t>4"M 856°'>7 

folA-Dihydrofolate Reductase

LLVKPVEfPCNFENPLGVEMCKNFGVPG [VACDPRGVrOLEGKLPWHYrEDLOFFSETIQK 
FP n/MGRKTWEn , L.PPKYFVDRA , / , >/VFr:HEKRO(^VHGE I WVTSLFEFLL,I>DL. l )i'PTFLICjG 
GELY:;LFLENOIVPDrF[SHrKKEYA^EyrFFrr J :;LLCTWTKTVI.Rn , ruK rTTCYYENfiHS 
(JfJTKNISL 

t Pn_07r.fl *^<,- t t, i ; (l .i4 

CT611 hypothetical protein

RHGPKLCLErpKR.'/jPVTMKITr/KTI-KrYrYDDLY^tLESSr.rKt.NFP:; rwrTSKIVfi 
LL FGA WF.LHK V'jK DELI KOEv\DA YVI-VCK Y< "> lYLTKKWGILf P: :A^, I UWMVFL lYFVLY 

ppdf[.lsvntu;lwlrnf\ uLr.iKGE i rr;iJ:-nrppLHPGTMGu;uw ;ffpi .ynyw ;kp 

IX:FGRALKMTYSNLLDGLSAAAVI/ W,l'/]r)EQ'VlLA[ IEEAF'K ITFIl:;:jPTrL^.U)M:*TLA 
[ AFUEDLYGPLLOwMAWETPAE'T 1 ';
CPn_0761 HS7h-lfl 858375

CT610 hypothetical protein

H I MT.'JW r E LLDKQ I EDQHMLKHEFYQRWSECKLEKOQLQAYAKDYYLH IKAFPCYLSALH 
ARCDDLQ I RRQ I LENLMDEEAGNPNH I DLWRQ FALS LGVSEE ELANHE FSQAAQDMVATF 
RRLCDMPQLAVG LGALYTYE IQ I PQVCVEKI RGLKEYFGVSARGY AYFTVHQEADIKHAS 
EEKEMLQTLVGRENPDAVLQGSQEVLDTLWNFL3SFINSTEPCSCK 



EELLYFSRFSEKQNYSLGNMETKRS I YMNLPDRKKALEAAVAY I EKQFGAG3 IM5LGRHS 
ATHEIST IKTGALSLDLALG I HGVPKGRV I EIFGPESSGKTTLATH IVANAQKMGGVAAY 
I DAEH ALDP SYAS L IGVN I DDLM I SQ PDCGEDALS I AELLARSGAVDVI VI DSVAALVPK 
SELEGDIGDVHVGLQAR>KSQALRKLTATLSRSG / rCAVFINQIREXIGVSFXjNPETTT^ 
RALKFYSS I RLDIRRIGS IKGSDNSD IGNRIKVKVAKNKLAPPFRIAEFDILFNEGISSA 
GC I LDLAVEYN I IEKKGSWFWQEKKLGQGREFVREEIJCRNRKLFEEIEKRI YDVIAANK 
T PSVHAN ET PQ EVPAQTVEA 

CPn_0763 860520 859972

ygfA-Formyltetrahydrofolate Cycloligase

NFPMTDPKI EKSALRKLF IS IRRDLSEERKHEASSAVASFVRSFSKESWLSFVSFNHEI 
DMCEANR I L IQKCTLALPKIDQENLYPVL I PS IDDLISWHPKDPFSKCTPISSDKITHV 
LVPGLAFDQOGYRLGYGHGFYDRWLAQHPYPS I RT IG IGYCEQKIDRLPQESHDI PLSQI 
YLC 

CPn_0764 861819 860524 

CT648 hypothetical protein 

GYKSMDIKKLFCLFLCSSLIAMSPIYGKTGDYEKLTLTGINIIDRNGLSETICSKEKLKK 
YTKVDFLAPQPYQKVMRITCK^RGDNVSCLTAYHTNGQIKQY^ 

GN I K I QAEV IGG I ADLH PS AESGWLFDQTTFAYNDEG I LEAAIVYEKGLLEGSSVYYHTN 
GN I WKEC PYHKGVPQGKFLTYTS SGKLLKEQNYQQG KRHGLS I RYS EDS EEDVLAWEEYH 
EG RLLKAEYLD PC/THE I YAT I H EGNG IQA I YG KYA V I ET RAFYRG EPYGKVT RFDNSGTQ 
IVCTTNLLQGAKHGEEFFFYPETGKPKLJ^^ 

KSGLLT I YYPEGQ IMATEEYDNDLLIKGEYFRPGDRHPYSKIDRGCGTAVFFSSAGTITK 
KIPYQDGKPLLN 

CPn_0765 862415 861801 

CT647 hypothetical protein 

TT IY IKLLGRLMKKWI S I L I LS FLSLLS I LPVLAIT INHVKI SQRWSDLNSQ I LTLKVTR 
DHEDQVIKHNARISKDRNNLSIESLNASCKQLRPLSKERERI^^ 

KRiAL EK SNHQ LVWNCE QMHNDF AFVRLEQATEMDNED I E SLFSLFN PEN PVAPLVF FTCW 
KWT^QTTPLGNEVWLTHAEAISRWI 

CPn_0766 863785 862394

CT646 hypothetical protein

Af^KLPVYHIGLTKAEbOTIKIAILQKTCKGWIVCHCEQIPEGKTWSLPKKYFAAPTTF 

SI^SDILVKSSSSSLK^KNILKVALTNLEASIJ^PWESLIVQP^ 

W^QKOTLKKELSFLSQAQIFPDKLSCRAADIFFLAEQSPIJCSLPAYLLIYGGSEEVTCI 

FViaJ^IAVARSFSNHSTKKSCDDIHATLQYIQETFPQTVLPAIHVAQISPNLQKILEQK 

L&LPLWCQSMTYGVEDEDWE I YGDT I AAAHHGASRRPLTFPYDAT SVS PAAQKHWLLRS 

SlOGKYALMATVWSL£SVIJU^ 

KN£^NYPLLPTIPTSEQTLKFLLAI/3KSSPSIKFSYFSYTMTSYPSKDNPSLPYSALV^ 
VKgISGOPED I PQFLKKISSHPPCLQHVSESLEDQRSFKLQFTLSS 



CPn_0767 863878 864177

CT645 hypothetical protein 

N IMLSYLLRTAINVYS FL I LAY I FASWVPDCQ SARWYQLVSKCVDP FLNFFRRFVFRIGF 
Idt^PFVGLLCLGILPFVILRVLRF 1 1 LNI FHS PWLLQYL 

CPn_0768 864144 865163

yotU/nir3 -predicted oxidoreductase 

YF^MAAPIFI!WILLRSSIVVAPLAGFSDYPYRCMSALTOPGI^FCEMVK\TOILy^ 
ERTSJCLLDYNENMRP IGAQLCGSNPETSGEAAK ILEGLGFDLIDLNCGC PTDKITKDGSG 
SGLCKTPEL IGR I LDK I INSVS I PVTVK I RSGWDMEH INVEDTVR I IRDAGAS AVFVHG R 
TRA0GYHGPSKQEYISRAKAAAGKEFPVFGNGDIFSPEAAQAMLTTGCDGVLVARGTLGA 
PWl'ffltQ IQDYLTTGSYEKI PFIKRKAAFLEHMRLVEDYYQSETKFLSETRKLCGHYLISA 
AKYjRJfL RS S LAKAT SYQEVYQLVNDYE EADDS S LETFVKC 

CPn_0769 867763 865121

topA-DNA Topoisomerase I-Fused to SWI Domain

SrC<2PHAIRLMKKSLIIVESPAKIKTLQKLLGSEFVFASSIGHIVDLPAKEFGIDVDHDF 
EPQYQVLPDKQEVINHIRKLAAKCEKVYLSPDPDREGEAIAWHIANQLPDSPLIQRVSFN 
AITKNAVTEALKHPRT IDMALVNAQQARRLLDRIVGYKI SPILSRKLQQRSG ISAGRVQS 
VALKLWDREKAI DAFVPVEYWNLRVLMQDPKTTKTFWAHLYAVQGKKWEKE I PEGKTEN 
DVLLINSEEKARHYAELLEKSSYTITRVEAKAKRRFAPPPFITSTLQQEASRHFRFSASR 
TM3 rAQTLYEGVDLDSEDSTGLITYMRTDSVRVDPEALTTVREY IQQTFGKEYLPEKANV 
YTTKKMTODAHEAIRPTDINLTPDKLKNKLSDDQFKVYNLIWKRFVASQITPAIYDTLAV 
QITTDTEIDLRASGSLLKFKGFLAVYEEKODDENDQEEDHPL.PPLHAQDALIKEEVSQEQ 
AFTKPLPRFTEASLVKELEKSGIGRPSTYATIMNKIQSREYTTKENQRLRPTEUSKIISQ 
FLETNFPRIMDIGFTALMEDELELIADNKKPWKLLLQEFWTTFLPWITAEKEAVIPRIL 
TN I EC S KC HKGK LVK I WS KN S Y FYGCS EY P EC DYRT S EEELAFNKEDYAEDT PWDS PC PL 
CGGVMKVRHGRYGTFLGCEKYPECRGTISIHKKGEEIEQEEPIPCPAIGCNGKIFKKRSR 
YN K I FY SC S EY P ECS V I GN S I DAV I TKYSGT E K I PYKKKT PTKKKS SAKTTKAAKT PSKK 
CKAKSSVKKSSEKKTGPLFLPSPDLAKMIGNEPVSRGEATKKIWDYIKEHQLQAPENKKL 
LVPDNNLATI IGPNPIDMFQLSKHLSQHLTKV3NDESSASS 

CPn_T770 868322 St><H31 

^T f A2 hypothetical protein 

KPRTRNVEKLEFVTCLCSPDDDLITFNKQGLIAGPEEEKVAFLVRSNAMLDAGPETPASF 
(-ErJLREOFOrFPEWEVLYSNEGLDWEAGCTWIIJiliEVTIQLRKHHRKASPWLGMYSRD 
t:V^\HEAVHAVRMKFHEPVFEEVLAYOTSRWGWRRFFGPLFRSPGESYLLLFFTILGLGI 
: ; [ ,WY PAG I L , £ ML V L PM Y F LM R LCMAQ S Y L Y RAM K K I P KM LGV P P LWVL LR LT DK E I KMFA 
K E f • r P V L E H Y AR K R K L ENVRWKQ I Y Q 'Z Y FV 

« 'i-njvr; i « ais u ^ih.i 

tpofJ RNA r'olyini-f.i",)> ujma V* 

fl-T^N:;KRi;^U:;:;ALDMFOOKfJKLr;LKYLP r ;L,kMQC>GLOMLOSPLTELSSY 1 J f 'yQEIIDNP 

KFOi/;.';r.r-:t:i':EW::pf ypftn'Jtfhylnot pejPOE^L /tpllpq l eeafstaeerfiahqi 

A/;rn,:;i)I-;f ;r ^RNPGDFAOELELPLEKIHKVWPT EOIiLOT'EG CATIPSLOSYWMKLLRNSS 
Hf.'OAY:j IVF'IX'YPLMTNt'EFAP IMKKFjL'JL^CLRN [LKKALGC I PWCPAAACTVKPMVS 
TPLf '!') I '{\ YAViY T KVSTRf iLPS rKl.NKrTFHFYEHLPKEEQKNLSQQILSAKWLIK 



NLRKREQTLLQVMETLLPKQEDFLLGK I PAPY PLC I KDLAEDL J FHE.7T I FR A I ENKAVA 
AP rGIFPLKHLFPRGrHQDS^HSKE^A ^ f LOWI ROWTATEC/TFLCD.TV ITAKG t PCAR 

RTVAKYRAQLKtLPANKPKKLFYrRGoNoHFRDROF 

CPn_0772 872400 87046$ 

uvrD-DNA Helicase 

KLGLIMTC I SELNEAORKAVTAPLNPVLVLAGAGAGKTRWTYR I LHLI NQG I APREI LA 
VTFTNKAARELKEP I'/NQC ASTNEFDVPMVCTFHSLGVF TLRRS rNLLNRENNFTI YDQS 

\rj\:.i'F"f;[ i..\> :™''LLi'' akm.. "«"■<_: TtJMAC v T:^CLL-*Kcwp 

NV FAVG DPDQS I Y JWRGAN I HN X LNF EMDY PNAKV LCLEENY RSYGNI LMAANAL I KNNA 

SRLEKELRSVKG PGEK I RLFLGSTDREEADFVAAEI LQLHRVGNI KLRD IC I FYRTNSQS 

RTFEDALLRRRI PYEI IGGLSFYKRKEIQD ILAFLR I FI SKSDIVAFDRTVNLPKRGIGS 

TTIFALTQYAIAQGLPILKACO^ALDTKDVKLSKKQQEGI^EYljALFPQIEHAYlTrLSLR 

DFIESVVT^ITGYLEILKEDAXTTFECDRKSNLEELYHKALESEQQNPKTHLELFLDDLALKG 

SDDDLNLTADRVNLMTLHNGKGLEFRVSFLVGLEEQLIjPHANSLGGTYENI^ 

G I TRAO DLL YLT AAQ VRS LWGTVRMMK PS R F LK E I PK DYM I QVR 

CPn_0773 872485 873195 

ung-Uracil DNA Glycosylase 

FMQNATIDQLPVSV^EQLPLCWREQLKEEWSKPYMC^LLIFLKQEYKEHTVYPEENCVFS 
ALRSTPFDQVRWILGQDPYPGKGQAHGLSFSVPEGQRLPPSLINI FRELKTDLG I ENHK 
GCLQSWANQG I L1XWTVLTVRAGEPF S HAGKGWELFTDA I VTKL I Q ERTH 1 1 FVLWGAAA 
RKKCEIJ-FNSKHQHAVI^SPHPSPLAAHRGFFGCSKFSKI^LL^L^PMINW^ 

CPn_0774 873183 873425 

CT606.1 hypothetical protein 

LEAPMNEGIHSVCFQKTPRLTAKSWSMEMIXTTQQLPSAEGMPSVANLEADFLRAEALL 
AEMRE I RGCLEQSLRTLVPSE 

CPn_0775 874040 873414 

yggV family 

ERFMKIVIASSHGYKIRETKTFIJaiI^DFDIFSL5DFPDYKLF<2ECGDSITANALTKGIH 
AANHLGCWIADDTMLRVPAL^LPGPI^ANFAGVGAYDKDHRJC^ 
AYFECCVVLVSP^EIFKTYGICEGYISHQEKGSSGFXTYDPIFVKYDYKQTFAELSEDVK 
NQVSHRAKALQKLAPHLQSLFEKHLLTRD 

CPn_0776 874180 875487 

CT605 hypothetical protein 

fifvijcnfydcllmffqflsftmkkifysfvllscifpyvgcaqvfvgldrifse^ 
cicgkkialishsaainsrckjdalsvfysrkhdctveilctlehgyygatptetvgnqps 
ry pnlrsvs lygvkevpkevaehcdvfvydvq d i gvrsysfvtvlmq ivkas erygkql i 
vldrpnpmggrivdgplpnpttsgslai pycygmtpgelalffkktyapnanwvi pmkg 
wnrsmtfdetgl iwmptspqmpdpqs pffyaatgilgalsvasigvgytlpfkvlgapwm 
dgekvadeij^rmklpgvlflpffyepffgkyk>ie>k:sg^^v^ 
vlkalypkqvec^tjcsieriparrssicm.fggdeflsishkeryrvwplri^lckesres 
fhqlrsscllseyaes 

CPn_0777 875586 877178 

groEL_2-heat shock protein-60 

TS EDRWWVFKSQFEG LSALKRGVHALTKAVT PAFG PRGYNWI KKGKAP I VLTKNG IR I 
AKEI ILQDAFESLGVKIjAXEALLKWEC/rGDGSTrALWIDALFTC^LJCG IAAGLDPQE I 
KAG I LL SVEMVYQQ LQ RQ A I ELQ S PKDVL HVAMVAANHDVTLGTVVATV I SQ ADLKGVF S 
S KDSG I S KT RGLGKR VKSGYLS PY FVTRP ETMDVVWEEALVL ILSHSLVSLSEELIRYLE 
LI SEQNTHPLVI IAEDFDQNVLRTLIOJKIJWGLPVCAVKAPGSRELRQVVLEDLAILTG 
ATLIGOESENCEIPVSIJIJVLGRVKQVMITKETFTFLEGGGDAEIIQARKQELCLAIARST 
SESECQELEERLAI FIGSI PQVQ ITADTDTEQRERQFQLESALRATKAAMKGGIVPGGGV 
AFLRAAHAIEVPANLSSCOTFGFETLI^AWTPLKVLAQNCGRSSEEVIHTILSHENPRF 
GYNGMTDT FEDLVDAG I CDPL I VTTS SLKCAVS VSCLLLT S S FF I S SRTKT 

CPn_0778 877400 878092 

tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase

A PVAQSDRVPGY EPGGQRFES S LVRNNKRVEEEVFMT LS LVGKEAPDFVAQAWNG ETCT 

VSLKDYLGKYVVLFFYPKDFTYVCPTELHAFQDALGEFHTRGAEVIGCSVDDIATHC^^ 

ATKKKQGGIEGITYPLLSDEDKVISRSYHVLKPEEELSFRGVFLIDKGGIIRHLVVNDLP 

LGRS I EEELRTL DAL I FF ETNGLVC PANWHEGERAMAPN E EGLQNY FGTI D 

CPn_0779 878502 878095 

CT602 hypothetical protein 

RFDL I FCWKFTVALFG EAEKGS YDTAYFCRSLVDLHNYLGDVSS PG ITLAIKTLLSDYNV 
VYFRVREEGYCVDSYFFGLHFIjKTQTTLKNIIAIGLPGVGNQHI IEASRSLCQKHNSLLL 
FFDHDLYDLLTFNQPF 

CPn_0780 879241 878591 

papQ/amiB-N-Acetylmuramoyl-L-Ala Amidase 

HGNKIAVOSLRFMHAKLSFFILLSLLFSGIDCSRLHAAGRSPSLQGVLAEIEDISAKLAS 
HEVEIVMLS ERLDEODSKCOKWTAAKPETLAOK I RELES DQKALAKTLAVLTTSVKDLCT 
NLQSKLOEIOKDHR.ALAODLRLVRRSLLALVDSESPGAYADFSDPVPENIYIVREGDSLS 
KIAKKYKLSVTELKKINKLDSDAIYAGQRLCLQP-NKQ 

CPn_0781 879851 879198 

pal-Peptidoglycan-Associated Lipoprotein

QNCYRSRRKTVPLLGCFPSATDKENTMtJIHSLWKLCTLLALLALPACSLSPNYGWEDSCN 
TCHHTRRKKPSSFGFVPLYTEEDFNPNFTFGEYDSKEEKQYKSSQVAAFRNITFATDSYT 
I KG EENLA I LTNLVHYMKKNPKATLY I EGHTDEPGAAS YULALGAR RANA I K EHLR KQG I 
SADRL3TISYGKEHPLNSGHNELAWQQNRRTEFKIHAR 

CPn_0782 881077 879773

tolB-polysaccharide transporter

( ItJ IfiMLROLCFOVFFFOFA^LVYAEELEVWRCEHITLP rEV^OQTDTKDrK lOKYLisoL 
TE I F< 'KD [AU1DC -I^rTAA^KCSSSPLAI SLR LHVPQLSWLLQ.S^KTPOTLC:; FT UIQN 
U:VDI'0K I MHAADTVMYALTG [ PC ESAGK EVFALSSLGKDQKLKOGKLWTfDYf> IKNtiAP 
LTTP- "A/.", ITPKV^\"\\'SNFPYLWCYKYGVPK IFLGSLEfJTEGKKVLPLKfiNOI.MI'TrS 
PKKKI.LAFVAI/I'YuNri^Li* I O^FSLTHCPMCP PPRLLNErJFGTQf'NP.'*FNPF(":;0[A/F TS 
NK£/.E'l F^l-Y I M: -t .PPFl VAPRLLTKKYPNSSC PAWSPD"JKKI AFCr:;V [K( jVR^If I Y!)I ,j ; 

*;< :n//oi;rr*:pTNKr :p:.wa i d:'rhlvf.';agnaeeselylislvtkktnk r a u ;vgi:kri' 

r-'IW^AI'T'LKlPIKKTr 

( Pn_0 /x i >i H I HUH HH I lOh 

CT'.'tH hyptii-h.-r u .i I pt nr.un 

TMMKYLPY [ALTAi'llkVU LLL.VKA. : Pf.PKKRLOPKAFOEKLVT rQPKF'E 'V! 'T! " ;VW1 jp
AKT E R P3VATQ PQ KQAKC3 P PQ ENVQ KA LQ K P I PKV I KT E P PK PS P A PTVAK KTTATEKP
P PSTT K KNTQ L3 KTQLQT L5 EVAQAL 2 LHVDK I EK S ETS LKN t SWP3TAQLTMH S ELKAT 
QEDELCELFRTHrALP5KG*rVRIKLVL3PNCE:r0ECSFL3EVSAADKQLLT0RIQALPFQ 
KFLEKYKVSKNr3FHrKL73NEG 

CPn_0784 882359 881892 

exbD-Biopolymer Transport Protein

DR AD3 T FTT F. P f P ^0 r PQY K r .M KYP FT E E T EE E P LVNLT PL I D EVFV I LMA F I VAVP L I K 

tALAi-- ;-:vf.J vr, mil.'a/:' /. ,m>h ..y^jstt. . v\^:jt ;~lti.: mfai ! ' - 

- n 'LL.LUUJ :•;[■: :V'.riVK:iALr-\/ -t Mi,L!IVA„Vn 

CPn_0785 883039 882296

exbB/tolQ-polysaccharide transporter

DHLYFE7LSVNKDFYSMVHFSHNP I IQAYTEADFFGKS I FFCLLILSVCIWTVLHQKLAI 
QKNFLKAGKSLKDFL I KNRHAPLSLD I HPELS PFADLYFT I KRGTLELLDKNRQS APDRG 
PI LSSEDIQSLETLLGAIMPKYKALLHKNSFI PATT ISLAPFLOLLGTVW3ILVAFTHIS 
SGSSGNSAIMEGLATALGTT I IGLFVAI PSL I AFNYLKAHSS ELISEIECTAYLLLNS I E 
VKYRNTNL 

CPn_0786 883X37 885293 

dsbD/xprA-Thiol:disulfide Interchange Protein

NHGVILNKFKTYLQTALI APFFSFPALSGSFS S IQAEEITQQVNHPGAELLSEGSY I PGL 

QTFRLG I KITASKGSH I YWKNPGEIGSPLKISWQLPKGFWEEEHWPTPKVFEEEGTTFF 

GY EDSAL I VADVRAP EGYT PGQ EVELRAQVEWLACG DSC LPGNVDLKLT LPYEEKE PSLY 

PDTHAEFTKTLH AQ PRVLENDH SVQVAQGKGNE 1 1 LN I S KK I NATKAWFVSEKADKLFAY 

AETSYSGGTGTAWRLKVKNLSGVQKNEKLHGILLLAI5HTGRPVESLTIHSEVLGQTGSAV 

AGLSQY IT ILIMAFLGGVU^IMPCVLPLVTLKVYGLIKSAGE^R^SVIANGLWF^ 

GC FWGLAGVAF ILKVLGHN IGWGFQLQE PMFVATL 1 1 VF FLFALS S LGLFEMGTMFANLG 

GKLQSSEMKSSNNKAVGAFFT^ILATLVTTPCTGPFLGSVUSLVMSLSFl^ 

LGMASPYLVFSVFPKMLSVLPKPGGWMSTFKQLTGFMLLVTVTWLW 

LLGGLWLAG LGAW ILGRWGTPVS PKKQRVCAS LL.FFAFLGGA I SVSGLASHYFAEPQQTV 

SWEDSLWQPFSLEKI^LRAQGRPVFVNFTAKWCLTCQMNKPVLYGDAVQKMFET'HGIV 

TLFJUIWrRKDPGITEELARLGRASVPSYVYYP^ 

CPn_0787 885604 886401 

yabD/ycfH-PHP superfamily (urease/pyrimidinase) hydrolase
. TRRQFVDLADAHVHLSDDAFEED INSVLQRAQDSGVSL^ P 
K I RFCHVGGTP PQDVDQD I EEDYPNF HAAAHS KKLAAIGEVGLDYCFATESG IARQKEVL 
QRYLALS LECELPLWHCRGAFNCFFRMLDQYYHND PRSR PGMLHC FTGTLEEAQELISR 
GWF I S I SG I VTFKNAQDLRDLWELPLEHLL I ETDAPFLAFV PYRGKKNE PAHVLHT INA 
VAp^GMFPQEIJVALAYKNVLRFLHG 

CPn_0788 886521 887432

sdhC-Succinate Dehydrogenase

SU^f^LRMSRHEICPEVSHKKGKYYSTFIFRCIHSLAGIAFTF 

QGKGFVAMVNG FHKI PGLKIIEVAGLVLPFLCHAI I GI VYLFQGKSNCY SGDGSRPHLRY 
AifesYTWQRWTAWILLFG IAFHWHLRF IRYPVHVDIHGTTYYAVDIQPSRYDVI VRGT 
KG£l2rLNLPNTEASSIEVSRHDLGGADAAL^ 
- AUi^rLVIAAAFHGF^LWTFCCRWGVW 

CPn_0789 887436 889316

sdhA-Succinate Dehydrogenase

QM££!NRiCVI WGGGLAGLSAAMQLANLG I IVELVSLTKVKRSHSVCAQGGINAALNLKPE 
EEESP YVHAYtTT I KGG DFLADQ P FVL EMC LAAPR I IKMLDNFGCPFNRGPSGNLDVRRFG 
GTLYHRTVFCGASTGQQLMYTLDEQVRRREHAGRV I KRENHEFVRLVTDHSG RACG I ILM 
NLFT1>^LEIL£GDAVIIATGGPC^IFKMSTNSTFCT 

TANSRDKLRL I SESVRGEGGRVWVPGDSSKR I VFPDGSERPCGETGAPWYFLEDMYPAY 
GNLVSRDVG ARA I LRVC EAG LG I DGRKEAY LDVTH L P EKT RH KL £WLD I YK KFTGEDPN 
TVJ^IFPAVHYSMGGAWVDWPAADDPDRDSRFRQMTNIPGCFNCGESDFQYHGANRLGA 
NSLLSCLFAGLVSGDEASRFIEAFGASQATSSDFDRAU2QEKEENARLLSASGKENIFVL 
HE&I^IMVRNVTVKRitf^ 

LELAIAITKGALLRNEFRGSHYKPEFPERDDEHWLKTTVAWAPEEPEISYLPVCrrRHVA 
PTt-RDYTKSSTGKI ELTNIPDNIRLPI 

CPn_0790 889279 890103

sdhB-Succinate Dehydrogenase

NSRiFFLI I SVYPYRKREMMENLETFILKIYRGVPGKQYWESFELPLHPGENVISALMEIE 
KRPVNI LGEKVNPWWEQGCLEEVCGSC S ILVNGVPRQACTALIQEYIDATQSRE IVLAP 
LT KF PL IRDLIVDRS IMFDNL ER I QGWVAAD I EGETFG PQVTQEQQ ELL YALSQCMTCGC 
CTEACPQIDNKSDFIGPAAI SQARYFNTYPGDKRSKKRWRALMGKGG I EGCGQAHNCVRV 
CPKKLPLTESISAVGREISKFSLRSLFSALFKKKK 

CPn_0791 893104 890111 

CT590 hypothetical protein 

TCLRSSRKI WED I S DRNMY SCYSKGI SHNYL LHPMSRLDI FVFDS L I ANQDQKLLEE I F 
CSEDTVLFKAYRTTALQS PLAAKNLN I ARKVANY I LADNGEI DTVKLVEAIHHLSQCTY P 
LGPHRHNEAQDREHLLKMLKALKENPKLKESIKTLFVPSYSTIQNLIRHTLALNPQTILS 
T I HVRQAALTALFTYLRQDVGSCFATAPA ILI HQEYPERFLKDLNDLI SSGKLSRIVNQR 
EIAVPINL3GCIGELFKPLRILDLYPDPLVKLSSSPGLKKAFSAANLIETLGDSEAQIQQ 
LLSHQYLMQKLQNVHETLTANDIIKSTLLHYYQLQESTVRAIFFKEGLFSKEQVAFSTQH 
P R ELS E IQRVY H Y LH AY E E AKS AF I H DTQN P LLKAWEYT LAT LADASQ PT I SNH I R LALG 
WK3 EDPH3LVS LVTHFVEEEVEN I R I LVQQCEQTYH EARSQLEY I EGRMRNPLNNQDSQ I 
LTMDHMRFRQELNKALYEWDSAQEKAKKFLHLPEFLLSFYTKQIPLYFRSSYDAFIQEFA 
HLYANAPAGFRILFTHGRTHPNTWSPIYSINEFIRFLSEFFTSTESELLGKHAVINLEKE 
T3RLVHNITAMLHTDVF0EALLTRI LEAYQLPVPPS I LNHLDQLSQTPWVYVSGGTVDTL 
LLDYFE33EPLTLTEKHPENPHELAAFYADALKDLPTGIKSYLEEGSHSLLSSSPTHVFS 
[ I AGS PLFR EAWDNDWYSYTWLRDVWVKQHQDFLQDT ILPQLS IYAFI ENFCNK YALQHV 
VHDFHDFC3DHSLTLPELYDKGSRFLSSLFTKDKT/ALIYIRRLLYLMVREVPYVSEQQL 
P E V LDNV3 3 Y LG 1 3 : : R I TY EK F RS L I £ ET I PKMTL L33ADLRH I YKGL LMQ 3 YQK I YT EE 
U P Y LRLTTAM RH UN LA Y PA FLL FAD3NWPS 1 Y FG F I LN PGTT E I DLWK FNY AG LQGQPLD 
N EQELFAT3RPWTLYANP [DYGMPPPPGYR3RLPKEFF 

f.'TVi') hypnt ■ ht?t u'.* 1 protein 

fcHHL EN I FY I T R EMKHTFTKRVLFFFFLV t P I PLLL!ILMWnFF''n*AAKANLVQVLHTRA 
TfiLG I EFEKKLT EHKLFLDRLANTLALKSYA3 P':IAEPYA0AYNr.MMAL3NTDF3LCLEDP 
Fr/:3VRTFra^DPFIRYLKQHPEMKKKL3AAVGKAFr,LTrrf;KPLLHYLrLVEDVA3WDS 
TTT 3G L LV 3 F Y PM 3 F LOK D L FQ S LH 1 T KG NIC L VN K V ) CVL FC AQD 3 Ci 1 3 FV F3 LD L PN L 
PQF0AR3P3AIEEEKA3GILGGENLITVS INKKRYLGLVLNK [ P L0GTYTL3LVPVSDL I 
Q3ALKVPL,NICFFYV[,AFLLJ^WWTF3KrNTKLNKPL0nLTFf MKAAWRGNHNVRFEPQPY 



GYEFNELGNrFNCTLLLLLM3IEKAD[DYM^EKL,OKZU;:L - ,L„V.ALL.,PCFPTFPKV 
TF^3QHLRPRQL3GHFNGWT/0rx>GDTLLG[ IGLAGC TO LP JYL f AL3AR3LFLAYASSD 
V3LQK r3KDTADSF3KTTECNEAWAMTF rKY\'EKDP3LELLJL3EGAPTMFLQRGESFV 
RLPLETHQALQPGDRL [CLTGGED r LKYF3QLP I EELLKDPLNPLNTENL :OSLTMMLNN 
ET EH SADGTLT I LS F3 

CPn_07?3 fl'ih«38 S [ >49L'^ 

rsbU-sigma regulatory family protein-PP2C phosphatase (RsbW

i: *■ c u : ' > 

: i m ' ' T-"" * - ' ^ '• ' % ;^ 1 ' ' * \ r :.F T 3N/w\r'-* a 

NTLTQIVPLNVDVL3LF3DVLDLDAG I i- ET PNV'LLo N EMOKV F'QG I YNE 1 3 L I KVF PNG D 
KI WASSIPEHLGEN-^NHKIDI PKNTPFLAALKQSPKNQEVFSVMQANVFDAKTQELQG I 
LYTTFSAESULJCDLLI^QSYLTVTCTAILSKYGVILKASD 

FLNDDPCPIDSELGPLTLSPLDIGENFYSFKIKDTEIWGCIENVPSIDIAVLSYAKKEES 
FAPLWRRARMYTAYFFC ILLGSLI AF IVARRLSLPI RKLATAMIESRKNKNCLYTDDSLG 
FE INRLGH I FNAMVENLHKQQHLAKTNFEMKENAQNAi^LGEOAQQRLLPNTLPSYPHIE 
lAKAYIPAITVGGDFFDVFVVGEGSKARLFLIVADASGKGVNACGYSLFLJaJMI^ 
SSSLCX3AIQETSRLFYNOTKNSGKF\TI^VYCY^C/rSNTME^ 

WLFHPGMAI^FLPEVANITSKLFHPKPGSLFVLYSDGITEAHNN^DMFGEERX>QAAIQG 
LTGKSAADAVHRLMLSVKTFVGNSHQHDD ITLL I LKVLES 

CPn_0794 897123 898004 

No robust homolog present in Genebank/EMBL as of 11/7/98
KS S KHRSFLLKKSGGNQVS LYQKWWN SQLKKS LCYSTVAAL I FM I P SQ ESF ADS L I DLNL 
GLDPSVECLSGDGAFSVGYFTKAGSTPVEYQPFKYDVSKKTFTI LSVETANQSGYAYG I S 
YtXTT ITVGTCSLGAGKYNGAKWSATCTLT PLTG I 

SGQPKAVCWASGATTVTQLADI SGGSRSSYAYAISDDGT I IVGSMEST ITRKTTAVKWVN 
NV FTYLGTLGG DASTG LY I SG DGTV I VGAANT ATVT^^3NO ES H A YMY K DNQMK D 

CPn_0795 898008 899195 

No robust homolog present in Genebank/EMBL as of 11/7/98 
GTIjGGANSSATGVSSCGSVIVGQACTADKSVHAFQ 

gkvib^rsoiadgswhafmchtdfssnnvt^dl^^ 

asdheftefgrsnialgaglywai^^pskiaaqyfgiay^irpkytu/t^ldhnfss^ 
vpnnfwshnrlv^atigwqdsdalgssvkvsfgygkqkatitreqleot 

EGVAAQ I EGRYGKS LGGHVRVQ PFLGLQFVH I TRKEYT ENAVQF PVHYD P I DYSTGWYL 
GIGSH IALVDSLHVGTRMGMEQNFAAHTDRFSGS I AS IGNFVFEKLDVTHTRAFAEMRVN 
YELPYLQSLNLILRVNQQPLQGVMGFSSDLRYALCF 

CPn_0796 899280 901340 

No robust homolog present in Genebank/EMBL as of 11/7/98

S ELYSSYLQPCLNMS IVRNSAL PLPC LSRS ETFKKVRSHMKFMKVLT PWI YRKDLWVTAF 

LiTAIPGSFAHTLVDIAGEPRHAAQATGVSGDGKIVIGMKVPDDPFAITVGFQYIDGHLQ 

PLEAVRPCCSWPNGITPDGTVIVGTNYAIGtfGSVAVKWV^KVSELPMLPOT 

VSADGRVIGG^JRNINLGASVAVKWEDDVITQLPSLPDAMNACVNGI SSDGSI IVGTMVDV 

SWR^AVQWIGDQLSVIGTLGGTTSVASAISTDGTVIVGGSE3^ 

IGTLGGFYSLAHAVSSDGSVIVGVSTNSEHRYHAFQYADGQMVDDG 

DGKVIVGRAQVPSGDWHAFIXPFQAPSPAPVHGGSTVVTSQNPRGMVDINATYSSLKNSQ 
QQLQRLLIQHSAKVESVSSGAPSFTSVKGAI SKQS PAVQNDVQKGT FLS YRSOVHGNVQN 
C^LLTGAFMDWKI^APKCGFKVAXJf^GSQDALVERAALPYTEQG 

RY DFNLG ETWLQ P FMG I QVLH LS REGY S EKNVRF P VS Y D SV A YS AATS FMG AHVFAS LS 
PKMSTAATLGVERDLNSH I DEFKGSVS AMGNFVLENSTVSVLRPFASLAKi^IAmOOQLV 
TLSWMNQQPLTGTLSLVSQSSYNLSF 

CPn_0797 901552 902694 

No robust homolog present in Genebank/EMBL as of 11/7/98 
VL ILTWI^A^TKIJGL^WSKKIKVU3HLTIXr^LFRGVLCAAALSN IGYASTSQESPYQKSI 
EEWKGYTFTDLELLSKEGWSEAHAVSGNGSRIVGASGAGQGSVTAVIWESHLIKHLGTLG 
GEASSAEGISKDGEVVVGWSDTREGYTHAFWDGRTMKDI^LGATYSVA^ 
VGVSATARGEDYGV^VGVKWTKGKIKQLKLLPQGLWSEANAISEDGTVXVGRGEISRNHI 
VA VKWNKNA VYS LGTLGG S VAS AEA I S ANGKV I VGWS TTNNG ET HAFMHKD ETMH DLGT L 
GGG FSVATGVSADGRA I VG FS AVKTG E I H AFYYAEG EM EDLTTLGG EEARVFD I S S EGND 
I IGS IKTDAGAERAYLFH IHK 

CPn_0798 902810 903856 

No robust homolog present in Genebank/EMBL as of 11/7/98 
WFEI I FWRVPMKKTCCQNYRS IGWFSWLFVLTTQTLFAGHF I DIGTSGLYSWARGV 
SGDGRVWGYEGGNAFKYVDGEKFLLEGLVPRSEALVFKASYDGSV I IG I SDQDPSCRAV 
KWVNGALVDLG I FSEGMQSFAEGVSSDGKTIVGCLYSDDTETNFAVKWDETGMWLPNLP 
EDRHSC AWDAS EDG SV IVGDAMGS EE IAKAVYWKDGEQHLLSNI PGAKRSSAHAVSKDGS 
F I VGEF I SEENEVHAFVYHNGVI KDIGTLGGDYSVATGVSRDGKVIVGHSTRTDGEYRAF 
KYVDGRM I DLGTLGGSASFAFGVSDDGKTI VGKFETELGECHAF I YLDD 

CPn_0799 905001 903940 

No robust homolog present in Genebank/EMBL as of 11/7/98
KREENMAAIKQILRSMLSQSSLWMVLFSLYSLSGYCYVITDKPEDDFHSSSAVKWDHWGK 
TTLSRLSNKKASAKAVSGTGATTVGFTKDTWSRTYAVRWrA^GTKELPTSSWVKKSKATG 
ISSDG3 1 1 AG IVENELSQSFAVTWKNNEMYLLPSTWAV^SKAYGISSDGSVIVGSAKDAW 
SRTFAVKWTGHEAOVLPVGWAVKSVANSVSANGS 1 1 VGSVQDASG I LYAVKWEGNT ITHL 
GT LGC Y S A r AKA VSNNG KVI VG RS ET r/G EVH AFCH KNG'^MS DLGT LCG5 YS AAKG VS AT 
G KV I VGMSTTANGKLHAFKYVGGRM I DLG EYSWKEACANAVS IDGEI IVGVQSE 

CPn_OBO0 ^06550 90524'* 

eno-Enolase

RKEIKIMFE-WIADI0AREILDSRGYPTLHVKVTTSTC37GEARVP3GA3TGKKEALEFR 
DTD3PRY0GKGVL0AVKNVKEILFPLVKGC3WEQSLID3LMMDSDGSrNKETLGANAIL 
GV3 LAT Af f A.AAAT LR R P LY RYLGGC F AC S L PC PMMNL I HOGM H -\DNG LE FQ E FM I R P I G A 
33 IKEAVNMGADVFHTLKKLLHERGLSTGVGDE^JGFAPf JLASNEEALELLLLAIEKAGFT 
P(5 K D 1 3 L/\ LDt ' A^\3 3 FY NV KTGTY DG P I- 1 Y EEO t A I L 3 MLC DR ^' P [ D3 I E t\T LAE EDY DGW 
ALLTEV Lj 1EKVQ TV*. IPPLFVTN PEL I LEG ISMGLAN: jVL I KPNO I ( ITLTETVYA I KLAQM 
AG YTT I I H R ■ '.G I rnTTT t A DL AV AF NAGQ I KTG3 L 3 F' 3 EP V -\ K Y M V I ,M E E E EELG 3 EA I 
FTTOJVF.IYtD-IF.F 

f;pn_0»0l 'Hl^/O'j ')0t',727 

uvrB-Excinuclease ABC Subunit B

E I FTMTFfJLMAI'rMWII VFFA [APL3AGVRN0*/K3QV r L/;TH : VWITV 1 AflVVANVTIL 
PTL.Vr.AHNKTt.AA01 A OFFIlFFFPMTJAVEYFr 3YYDYY0P3AY I AP3I)T\ t FK3LI. TNDC 
EDKLPL3ATR3I [.L:i<Kin , LtV:.3V3f:rYO[G3PENYT3MALVLEVr;K!:\ PRNI LTAOLVK 
MHYfjA :P I P0R3AFR! "lit !. IV ! i 1 1 FPAYF.3KLALHLEFLI lUrUVl) E FY: '.Dt*! ,'I'M I I 'K33VP 
3ATLYPf^:;iiYV f PCA I RFOA I WT [ 0 F f . [ .K F.RM A F F D DP P EEKDR L FHRTTHTj I f >M EKF U\ 
!^rKf;iFNY3RHITO\iJx;AlM'n;Ll.t>Y!'MM-:DFLL[ IDn:H0TLi\>I^AMVI<Gt/J , ,RK'03L 
Vrf^FRU'.'JAFDNRPLTYEEAOKYFRr/TYVGATPGDTEVQESSGHIVOQirRPTGIPDP
MPErRFATGUVDDLLEEIRLPLJQKHEKILVI3ITKKLAEDMACFL3ELEIPAAYLHSGI 
ETAERTQ I LTDLRGGV r DVL IGVNLLREGLDLPE'/SLVAILDADKEGFLRSTSSLIQFCG 
RAARN [NGKV TFYADOKTRSI EETLRETERRROrQLDYNKEHNIVPKPI IKAIFANPILQ 
TSKDSESPKECQRPL3KEDLE£QrKKYEAt>K2RAAKEFRFNEAAKYRDAMQACKEQLLYL 
F 

• : vpr.' ■! i r , ! ' "*!A .Syn' :i< r i"'- 

i, \nvMNKKKPv*. ' ii-i-.'r'.^LiiL'jiiWv'i^rr'^iXLunrLY^ * : \OLir*:.nY 

I RKEEVLDVDNH I YEVLADWL3VG I DPTKS 1 1 YLQSAI P E I YELHLLFSMLI 3 INRVMG I 
PS LKDMARNAS I EEGS LSYGL IGY P I LQSAD I LLAKAQFVPVGKDNEAHVELTRDI ARNF 
NRLYC^VFPEPEVLQGELTSLVGrDG<^KMSKSANNAIYLSDSDATITEKVRKMYTDPNR 
IRATTPGRVEGNPLFI YHDI FNPHKDEVEEFKARYRQGC IKDI EVKARLAEELIHFLKPI 
K ERRS E FLSKPLAI^NVLEIXTrHKMREVAKVTMEEVH DKFG FSHKWRSLLK 

CPn_0803 910306 909752 

CT584 hypothetical protein 
FMAAKTICTLELEDhA^LLLEG^KRIFATPIGYTTFREFQ 

L INGK LTQE LAPQQKQ AAHSL I AEFMMP I RVAXD I H ERG EFINF ITS DMLTQQERC I FLN 

RLARVDGQEFLLMTDVQNTCHLIRHLUVRI^ 

TKALQ 

CPn_0804 911074 910310

gp6D-CHLTR Plasmid Paralog 

E I FSSMGNLKTLLESRFKXOTPTKMEALARKRMEGDPS PLAVRLSNPTLSSKEKEQLRHL 
LQHYNFR EQ I EE PDLT QLCTLS AEVKQ I HHQSVLLHGER ITKVRDLLKSYREGAFS SWLL 
LTYGNRQTPYNFLVYYELFTLLPEPLKI EMEKMPRQAVYTLASRQGPQEKKEEI IRNYRG 
ERKSELLDRIRKEFPLVETTCRKTSPVKQAIJ^TKGSQILTKCTSLSSDEQIILEKLIK 
KLEKVKSNLFPDTKV 

CPn_0805 911346 911067 

minD- chromosome partitioning ATPase-CHLTR plasmid protein GP5D 
G YASRMKT I AVNS FKGGTAKT S TTLH LG AALAQYHQ ARVLL I D FDAQANLT S GLGLDPDC 
YDSI^VVLCGEKEIQEVIRPIQDTQLDLIPAiyrWI^RrEVSGNIAADRYSHER 
VQDKYDWIIDTPPSLCWLTESALIAADYALICATPEFYSVKGLERLAGFIGGISARHPL 
T I LGVALSFWNC RGKNNSAFAEL I HKTFPGKLLNTK I RRD ITVSEAAIHGKPVFATS PSA 
RASEDYFNLTKELLILLRDI 

CPn_0806 913816 911667 

thrS-Threonyl tRNA Synthetase
NA^Sfl^SPPNMEAWbn^IQVTCDQKNYEV^ 

THd^I^LVFLTSEDPEGREIFI^SAHLLAQAVLRLWPDAIPTIGPVIDHGFYYDFAN 
LS'fSESDFPLI EDTVKQIVDEKLAI SRFTYGDKQQALAQFPQNPFBCTELIRELPENEEIS 
AYiSQGEFFDLCRGPHLPSTAHVKAFKVLRTSAAYWRGDPSRESLVRIYGTSFPTSKEIJlA. 
HI»EQ I EEAKKRDHRVLGAKIiDLFSQQ ESS PGMPFFH PRGM IVWDAL IRYWKQLHTAAGYK 
EI&TPQLMNRQLWEVSGKWWYKA 

PI^R^AEVGHVHRQEASGALSGrj^VRAFHODDAHVFLTPEQVEEETI^ir^LVSTLYGTF 
GH ; ^HLELSTRPEKI>rrGDDSLWELATDALrHlALVQSGTPF I WPG EGAFYGPKIDIHW 
DAH &RTWQCGT I Q LDMF L P ERFELEYTT AGGT KSVP VMLH RALFGS I ERFLG I LI ENFKG 
RFPLWLS PEQVRI I TVADRHI PRAKELEEAWKRLGLWTLDDSSESVSKKIRNAQNMQVN 
Ylfj^^DHEINENyLAVRTRD^ ILEEKNSLSLTALL 

CPn_0807 913950 914879

CT580 hypothetical protein

TEj-QTGLHMSLFLVFLTAF I WS S S FALSKLVMNAS AP I FATGARMVT AGA I LALAAWFRGG 
FVGISKKIFLYIVLLALTGFYLTNIFEFIGLQSLSSSKTCFIYGLSPLMSALFSYIQLKE 
KViffiKK\/TCLSI^LVSYICYLTFGGGGDDSQPWI^ 

I E^QSTLSVTAINAYAMLI AGML S IMHS AWEPWRPLPVQD I SQFLYAT LALWr SNLI C 
YNCYAKLLRKYS STFLSFCNLVMPLYSGFYGWI L LG EKGVS LGLVLAVA FMVAGCR L I Y H 
EEfRQGYIVS 

CPn_0808 916398 914956

CT579 hypothetical protein 

LfGfiPSWALKSLKRMPQSAEPSLAH I KP 1 1 FKGAC I AMTSGVSGSSSQDPTLAAQLAQSS 
QKAjjgNAQSGHDT KNVTKQGAQAEVAAGGF EDL I QDAS AQSTGKKEATS STTKSSKG EKS E 
KSG]§3 K S ST SVAS AS ETAT AQAVQG PKGLRQNNYDS PSLPTPEAQT ING I VLKKGMGTLA 
LLGL^TLMANAAGESWKASFQSQNQAIRSQVESAPAIGEAIKRQANHQASATEAQAKQS 
LI SG I VN I VG FTVSVGAG I FSAAKG AT S ALK S AS FAKETGAS AAGG AAS KALT SAS SSVQ 
QTMASTAI<AATTAASSAGSAATKAAANLTDE5MAAAASKMA^ 

WS EKVS RGMNWKTQGARVAS FAGN ALS S SMQM S 0 LMHGLTAAVEG LSAGCTG I EVAHHQ 
RLAGQAEAQAEV LKQMS SVYGQQAGQAGQ LQ EQ AMQ S FNT ALQT LQN I ADSQTQTTSA I F 

N 

CPn_0809 917794 916307 

CT578 hypothetical protein 

OTNMSISSSSGPDN0KNIMSQVLTSTPCGVPQ0DKLSGNETKQIC^TRCH3KNTEMESDAT 
I AGASGKDKTSSTTKTETAPQQGVAAGKESS ESOKAGADTGVSGAAATTASNTATK I AMQ 
TSIEEASKSMESTLESLQSLSAAQMKEVEAVWAAL5GKSSGSAKLETPELPKPGVTPRS 
EVIEIGLALAKAIQTLGEATKSALSNYASTQAQADQTNKLGLEKQAIKIDKEREEYQEMK 
AAEQKS KDL EGTKDTVNTVM I AVSVA I TV I S I VAA I FTCGAG LAGLAAGAA VGAAAAGGA 
AGAAAATTVATQ I TVQA WQAVKQA V I TA VRQA I T AA I KAAVKSG I KAF I KTLVKA I AKA 
I SKG ISKVFAKGTCM I AKNFPKLSKV I SSLTSKWVTVGVGVWAAPALGKG I MQMQLSEM 
QQ NVAQ FQK EVG KLQAAADMI SMFTQ FWQQAS K I AS KQTGESNEMTQKATKLGAQ I LKAY 
AA I SG A I AG AH KTNN F 

CPn_0310 l USL93 <U7825 

CT577 hypothetical protein

GEIWIKKPKKTKKAVQSKAAPVKRVPEESQEAAIQQLELAVSDLYKELPLAQTFASLTDK 
NUfWSr lAAL^TLEnLHI.EELTCCLFrf.AQED^NFAKELS'JWHGLKNLTTVVNKQMVK 
CAE 

CPn_0fU I UtW3J HS2Q8 

lcrH-Low Calcium Response Protein H

L Q NM F T U ', r !< : : M : : K I"- : P f< N ANO POK PS A: ; I ^ J K KT P r , p L A L LA AU K K -\ K A 1>Xj L F.Q VH p V PT 
MEE [KKAU IN I FI-MI/IPJGLDLOO tUJL^DYLLLSI ^TVAYTfY'IQC JKYNFAV^LFQLLAA 
AAJP^NYKYMU ILS^.rYHOI.MLYNCAAFOFFLAFDAOPDNP [Pf'YY [ APSLLKLQCPEE3N 
NF'LDVTMI; { ' :< ;NNrL;FK n.KERCQ EMKO' ~ EEKOMAfiCTK KAI 'T^KPAt IK' IKTTTNKK.'JGK 

KR 



mutL-DNA Mismatch Repair

G I L ECWLGNLTKAPM3TRRP IQLLDPLT I NQ I AAGEV lENJV; WK EL I EN3LDAGADEI 
E I ETLGGGCGA III RDNGCGFRAED I P EALQRHATSK TREF-'IDI F3LN5FGFRGEALPSI 
A3 1 3KME IQSS I ECDEGVRTV I HCGD rVSCEPC ARQLGTTV I VNSLFYNVPVRRCFQKSM 
Q3DRLG IRKLI ENR I LSTAN IGWSW r SEGHHE IQ I AKQQG FQ ERVAYVMGDH FMQDALT I 
DKEANGVRIVGVLG3P5FHRPTRQGQKIFINDRP [E3LF ISKfO/GDAYALLLPLHRYPVF 
VLKLYLPSSWCDFTT/HPOKI EARI LKEELVCDC I KEAIVETLAC PPG ILCRTHQE I EESD 
SVPLPMFRMLETSDVQEEESVEFX^NLFAYSSEDV3LEKQFYT3RGPKSWDWIYSSDVR 

■ ' r M:^7 , .Ai"/. ,,| rr r "'\u" " r M - . ^. 

ALMKETLTQATF3KHQHVFDVSWLKLLW3VGKF EKGFtXJAKIRRLILDSDFMEG 

CPn_0813 920843 921934 

pepP-Aminopeptidase P

TLI LWKDNHMSHDRI LRAQRALS EHNLDA I LVEKSEDLAYFLHDEAI AG ILL IGQQEVMF 
FVYRMDKDLYSH IQRVPLTFLTQDWADLSLYVQKQRYQK IGFDSASTVYHKFAQRQVLP 
CLWEPLECFTEKIRS I KSEEE I RRMQEAAALGSAGYDYVLTLLREG ITEKEWRQLRAFW 
AEAGAEGPSFPPI IAFGEHSAFPHSI PTDRPLKKGDIVLIDIGVLLNGYCSDKTRMTALG 
TPHPKLLESYPVWEAQKRAMALCKEGVLWGD I DAEAVRVLREHHLDTYFIHG IGHGVGR 
H I HEY PCSPRGSQVKLESGMT I TVEPGVYF PG I GG I R I EDTLC I DKNKNFS LTARPVI S E 
LVCL 

CPn_0814 921996 923357 

CT814.1 hypothetical protein 

FFLFFKLSYNFIFNLPLTMYQLLS IGYS FVSF IALLWMLCYSPNYVTDLYR I SLSAEESL 
GGIRAFPQAESLIX3GACALNFPDLEERLPDLRKELLFLGSNDRFDACGGKFSLQLASSKE 
CY IAALKERVYLNVTNSSRGPVYSFSPKGVPTELWI ECFSVSVDGRVEVKVRLQGLHKEL 
ISKPRIXTETLFLNPPANKLDCWEIAGFRVDASFPVKQKIR^IGVDKFIXMHGGAEYADKA 
TKERVDFVSSDEE^SRYIAVXJDVLLWDGNCWQTCGEFQGASSRAPLFEVKR I DDKVMIA 
DLWNVGGTOROTISLVKGVPSPIEINEVIREIEFTG^SWSKPIVLVGGCRLILS 
LRTAKGWEKLSRADQ IQDYVTG KVTG PLLVFEKLEKDLRGFVLRG KMFNAQRTLVET I SL 
PLKQGFE PAVASQ EVSS NTRSAAAHPGATNRGGS 

CPn_0815 923361 925622 

gspD/pilQ-Gen. Secretion Protein D 

M^FRNSLLHLVAI^GMLCCSSGVALTIAEKMASLEHSGRGADDYEGMASFNANMREYSL 
QLSKLYEEARKLRASGTEDEALWKDLIRRIGEVRGYLREIEELWAAEIREKGGNIXDYAL 
WNHPETTIYNLVTDYGTEDSIYLIPQEIGAIKIATLSKFVVPKESFEDCLTQILSRLGIG 
VRQVNSWI KELYMMRKEGC SVAGVF S SRKDLEALPETAY IGFVLNSNVDAHTNQHVLKKF 
INPETTHVDVI AGRVWIFGSAGEVGELLKIYNFVQSES IRQEYRVI PLTKIDPGEMIS I L 
NAAFP^DLTKDVSEESLGLRWPU2YC<3RSLFI^GTAALVQQAI,TLIRELEEGIENPTDK 
TVFaYNVKHSDPQEI^AI^SQVHDVTSGENKASVGAAT^ 
GSVKYGNFIADSKTGTLIMVVEKEVLPRIQMLLiaa.DVPKKMV^ 

GLNLLRLGEEVCKKGC S PSVSWAGGTG I L EFLFKGSTG S S IVPGYDLAYQFLMAQEDVR I 
NAS PSWTMNQT PAR I AWDEMS IAVSS DKDKAQYNRAQYG I M IKMLPV INVGEEDGKSY 
ITLETDITFDTTGKNHDDRPDVTRPJ^ITNKVRIADGETVIIGGLRCKQMSDSHrciP 
DI PG IGKLFGMS ST SD SLT EMFVF IT PK I LENPVEQQERKEEALLS S RPGER EEYYQALA 
ASEAAARAAHKKLEMFPASGVSLSQVERQEYDGC 

CPn_0816 925600 927102 

gspE-Gen. Secretion Protein E 

RGKNTMAAS ILSQELLDILPYTFLKKHCLLPI EESSEAIT IAHATATSVIAQDEVKLLIK 
K PVR FVLKE ES E I LQR LQQL YSNREGNVS DMLLTMK EEDGTT I S E E E DLLETTET I PWR 
LLNWILKEAIEERASDIHFEPCEDSMRIRYRIDGVLHDRHSPPSHLRSALTTRIJCVLAK^ 
DIAEHRLPQDGRIKIHIGGQEVDMRVSWPVIYGERVVLRILDKRNVILDIAGLHMPKGT 
EILFKDTITAPEGILLVTGPTGSGKTTTLYSVI^ELKGPLTNI^IEDPPEYKLPGIAQI 
AVKPKIGLTFARGLRHLLRQDPDI LMVGEI RDQETAEI AIQAALTGHLWSTLHTNDAI S 
AIPRLLD^IESYIXSATLVGWAQRLWTICPYCKVAYTPENQEKSFLASl^KDTEMPL 
YRGQGC VHCFRSGYKG RQG I YEFLR PNTLFRS EVAS NR PYH I LRETAEQNGFL PI LEHG I 
ALAVSG ETTLAEVLRVTKRC D 

CPn_0817 927106 928287 

gspF-Gen. Secretion Protein F 

GGRMPRYRYTYLDPKERRKRGYLEALHIQEAREKLAQENIQVTJ5IREVALRRMSIKSTEL 
IVFTKQLLLLLRSGLPLYESLVSLRDQYHEQKMGLLLTSFMETLRSGGSLSQAMAAHPNI 
FDHFYCSGVAAGESVGNLEGCLQNIIWLEERAQITKKMVGALSYPCVLLVFSFAVMLFF 
LLGVIPSLKETFENMEVKGLTKIWGVSIXZLSAYRYLFLGFASALIWGILMRHRIPWKK 
I LEKLLFALPGTKKFVVKVAVNRFCS VAS AILKGGGTLI EGLDLGCDAI PYDRLKTDMRD 
IVQAVIGGGSLSQELAQRSWVPKLAIGMI ALGEESGDLADVLGYVAH I YNEDTQKTLAS I 
T SWCQ PVI L I FLGGL IGV I MLAI L I PLTSN I QTL 

CPn_0818 928158 928682 

predicted OMP [leader (16) peptide] 

GYTKNVGFDNWV3TRDSDFSWWPDRCDH VGN I DPTHKQYPN I IKCVLRGVGMKRQKRKQ 
SITLIEMMWITLIGI IGGALAFNMRGSIHKGKVFQSEQNCAKVYDILMMEYATGGSSLK 
EI IAHKETVVEEASWCKEGRKLLKDAWGEDLIVQLNDKGDDLVIFSKRVQSSNKK 

CPn.Oei^ 929117 928^56 

CT568 hypothetical protein 

ASLYGYCLFLIWEKFHNNIGKAJJFHLKI ITTDFLTDIYIVTIRDPIAYPLTGIC 

CPn_0820 929042 929659 

CT567 hypothetical protein

DESLPCRCCCGTFPRSET3S I RTEMPMCNSIAMKKQKRGFVLMELLMSFTLIALLLGTLG 
FWYRKIYWQKQKERI YNFYIEESRAYKQLRTLF3MSL3SSYEEPGSLFSLIFDRGVYRD 
PKLAGAVRASLHHDTKDQRLELRrCNIKDQSYFETQRLLSHVTHWLSFQRNPDPEKLPE 
TIALTITREPKAYPPRTLTYOFAVGK 

CP!i_082l -M'Kii? 'J30bf,H 

i*"rsr>j hypnrhci L,\ii protein 

HTNt.PLGNKPMQPF I FTLLrLTfUiV^LVAFDAA^JARKPCACAQT TERGKNTFr; IKRSACA 
F. I EYO f : K ^ R M Ai ; A [ LR T .'; K DKf ; K VT PK <j I AKVATK KKQR Y R LLy V P R P V NN3 R YN LY A 
L L P r P CC Y S DT A: 'l\"t' A £ F [ R L LP R A Y VDTOr P/ I'tt .1 E Y A I AN A I j [ KQ E I L E PC AQLG 
WV I r TLTLnX.C*AE [ FYKMt.KfJ'J'JNflfjfXLNKLMYEEK.'jLGIK'KLNIj \ FMDPLLLEAVL 
\mvi >AYRC'I\ :i .1 .HIV [ WFAVKPOKt IA 1 OKI K^AAALCL.FFTRTDFRIXI.HPKMOLLLSR Y 
l^l.LPLI J4KKMfTJYTr*;:;A( II »YT .[-7 A/bH/FKA I :.£'f 'HCP".yr, I KL 

iT'k.'. tiypor hft i. il pr'ji « in 

i'fli tVLi;rnKNu:if :i^t'MAf4:Ti'Ki'rj:::;K[::; r ;.';oFD f ;LKRKVKDLii:;NrKv;KWKKFL 
• ;m ra< 'ka u;r.r i. vr w ;i taixm .vt/vviir im y ;v , /i ,i ifuviz e rkmlunlo^'y itangpik 
NAELOiLE LFFVLNI PSFAVSF IVLCVt L3FTTTAPSCSTCSKDHCDKHQDTSNKPS

. CPn_082J "M24J4 03 1501 

yscT/apaR-YopT Translocation T

FYALQVRFS KTS I NGNKELMG I 3L PELFSNLGSAYLDY I FQHP PAYWSVFLLLLARLLP 
r FAVAP FLCAKLFP3 P I K IG 1 3L3WLAI I FPKVLADTQ ITNYMDNNLFYVLLVKEM I IG I 
V IGFVLAF P FYAAQSAC3F ITNQOG I QGLEGATCL I S I EQTS PHG I LYHYFVT I IFWLVG 
OHRrVTSLLIX^^FVTPrHSFFPAEMMSLSAPIWTTMIKMCOIXLVMTrQLSAPAALAML 

m::i '[.Fi, ;i ; r;rM-\rvV'.vi ■ \, . ■.-•r*;t,;,. * t v.vwr [ moil /r ^i .AwrM.YprM:. 
;.■ ;.:rnw'. 

CPn_0824 932677 932378 

yscS/fliQ-YopS/fliQ Translocation Protein

IRTRAVLAFFATSFKSVLFEYSYQSLLLILIVSAPPI ILASIVG IMVAI FQAATQIQEQT 
FAFAVKLWIFGTLMI SGGWLSNMILRFAGO I FQNFYKWK 

CPn_0825 933618 932677 

yscR-Yop Translocation R 

ERI KVFT IMRS I FRFSLC FFTL SVSCC FADAS LYENSCPSRCQPT P P PSNSNPLNWQQ P 
VAASSVPSYMPPLNADDVLPRDHLSDGSFSOTYPDITT<JAriLIFLALSPFLVMLLTSYL 
KIIITLVLLRNALGVC^TPPSQVLNGIALII^IYVMFP 

FTAEGAETVFVALNKSKEPLRSFLIROTPKAQIQSFT^rSQKTFPSEIRAHLTASDFVII 
IPAFIMQQIKNAFEIGVLIYLPFFVIDLVTANVLVAI^Ml^SPLSISLPLKLLLIVMW 
GWTLLLQGLMISFK 

CPn_0826 934382 933612 

yscL-Yop Translocation L 

H DNKR SGVF S S EVNQ PQRYYA I VKMK FF S L I F KDDDVS PNKKVLS P EAF S AF LD AKELL E 
KTKADS EAYVAETEQKCAQ I RQEAKDQGFKEG SESWSKQ I AF LE EETKNLR I RVREALVP 
LAIASVRKI IGKELELHPETIVSI ISQALKELTQNKHI I ISVNPKDLPLVEKSRPELKNI 
VEYADS L ILTAKPDVTPGGC 1 1 ETEAG 1 1 NAQ LDVQLDALEKAFST I LKAKNPVDEPS ET 
SSSTDSSSLSNDQDKKE 

CPn_0827 935273 934434 

CT560 hypothetical protein 

GCLVTANTFGTLDILMKHSKEDDLSRF1.PKNLLVESPHPEEIPIJCSLSFTMSWLPTIHPS 
WITI AMKEFPPEIQGQLLAWLPEPLVQEI LPLLPG I S IAPHRCAPFGAFYLLDMLSKKIR 
PCGITEEIFLPASSANAILYYTGPVKIALIKCU3LYSIAKEIJCHrU>KWrERVKNALSP 
TEKLFLTYCQSHPMKHLETTNFLSSVrTTDA 
RRIJDVGRAY I VEQTLKTWYDH PYVDYFKSRLEQOKVLVK 

CPn_0828 936292 935267

yscJ-Yop Translocation J

IKRYVAWIMVRRS I SFCLFFLMTLLCCTSCNSRSL IVHGL PGREANE IWLLVSKGVAAQK 
L PQAAAAT AG AAT EQMWD I AVP SAQ I T EALA I LNQAGL PRMKGT S LLDL FAKQG L VPSEL 
Q E^I-RYQ EGL S EQMAS T I RKMDGWD ASVQ I S FTT ENEDNLP LTASVY I KHRGVLDNPNS 
I^SKIKRLIASAVPGLVPENVSWSDRAAYSDI 

LTK&RL I FYVL I L I LFVI SCGLLWVIWKTHTL INfTMGGTKGFFNPT PYTKNALEAKKAEG 
AA#©pKKEDADSQGESKNAETSDKDSSDKDAPEGSNEIEGA 

CPn_0829 936729 937298

No robust homolog present in Genebank/EMBL as of 11/7/98
KYf C^VPTLAKSFYINIRDSRFYSWLCFIMKETYYR^ 

FFLANAKWPLVP AGYRRVRGKDFVLS PLVDLV I LFPWVTKDSRYS PCSMTFTC ICR S I VE 
C I £WSTLFG IGRFCAVWCVEGFSGSTFDKIYHT IVAVLGILGLGILTF ILRI IFSVLML 
PVWFLFKCYS 

CPn_0830 937339 937959

No robust homolog present in Genebank/EMBL as of 11/7/98
DSC^fLLPCFEVEAQTFPQVFSKVWYKYKSSR I LL I ALLYN ITLVLGL I F IHKKYLGQK 
GRVILKIYQNEEEFFRATERFPSIGAGYLRVRNKNSVLFPFEDLMLVCPSVPKDFPLSAF 
KVTTKLIYWSVLESI PWGAFFFSIGRLFAMWC I EDFPGSIFSRIYHTIVGVLGILGLGI 
I MF I LR I I FT LLTL PFWL I SC LKS S AA 

CPn_0831 938249 938434

No robust homolog present in Genebank/EMBL as of 11/7/98
^^^N^IRKSESEGAFFEATQ^PTIQQGYQL's/RIREHNIjSVRAHFDLSLSLDASVHP 

aa *y 

CPn_0832 939750 938827 

lipA-Lipoate Synthetase 

VMKCRPTLNTDQPRVRKKLPERFPKWLQR PLPQGSAFHATDAT I KRSGMPTVCEEALC PN 
RAECWSRKTATYLAIjGDVCTRSCGFCNIGHSKTPPALDPTEPERIALSAKELGLKHWIT 
MVARDDL EDGGAQGLVD 1 1 QKLREEL PQATT EVLAS DFQGNVS ALHTLLDSG ITIYNHNV 
ETVARLS PLVRHKATY ARSMFMLEQAANYLPDLK I K3G I MVGLGEMEGEVKQTLQDLAS X 
GVR I VT IGQYLR PSRKHLQVKSYVT PETFDYYRRVC EAMGLFVYAG PFVRSS FNADM I LA 
SVQDKASA 

CPn_0833 941171 939747

lpdA-Lipoamide Dehydrogenase 

RGVLFEILI TVS ENMTQ EFDCVV IGAG PSG YV AA I T AAQ S KLRTAL I EE DQAGGTC LNRG 
C I PSKALIAGANWSH IKHAEQFG IHVDGYTIDYP^MAXRKNTWQG IRQGLEGLIRSNK 
[TVLKGTGSLVSSTEVKVIGQDTTIIKANHIIL-^TGGEPRPFPGVPFSSRILSSTGILEL 
EVLPKKLA I TGGGV ICCEFA3 LFHTLGVE I TV I EALDH I LAVNNKEVSQTVTNKFTKQG I 
R r LTKA3 ISAI EESONQVR ITVNDQVEEFDYVLV* IGRQFNTAS ICLDNAGVI RDDRGVI 
PVDETMRTNVPNIYArGDITGKWLLAHVASHQGVIAAKN ISGHHEVMDYSAI PSVI FTHP 
EIAMVGLSLOEAEQQNLPAKLTKFPFKAIGKAV^LGAGDGFAArVSHEITQQILGAYVIG 
PHA33L IGEMTLAIRNELTLrCIYETVHAHPTL^EVWAEGALLATNHPLHFPPKS 

CPri_0H 14 *i/H514 'M2 0 1 1 

CTS 1 "^'. hypothetical, (jrnr.fin 

K r [ MPFAKETEMQRTCWKt 'Lri:, V:JMH Vn.X *PYC.' \FI/jDPPVA^ iGFSiJCH I ^FPEGASK 
K FA f\DL F A V : J E DWRA VU XkJN PTQ HTN K Q V r P EWTV/LQ ' ' W P LAA LFLC IGLLAFAFLIL 
I.F::TU:;i UA/LTWPKNRAYrYi"; [ KIAAVAYRC'i RkLK, 

mnt L :;W[/:;NF t..inuLy neUr^ise 

UN IMVLKALA [ FRQDAMQHfjLKHRKE TWOF^'ED.";' V [H n , Di;r.APEGYWL.3TLKLQ i D[D 
Kl/INwV;r::;t:r'iy;RrrLHL,MTAYFAVYDAI;;[j;^LHLrf-['!r^FW't f AVr'^(FFLD^IPLOAO 
1 j'KMVYTLESPII tTLTT i''t , L:*t^EVF0DW[,l<T L!lA:"CLf TVn'NKTrLKSALYPTAKKFFFL 

wr:R^Ar^,Tr(U^NS(j(;Ki'';iiF:iLQWc/;LVFKAr[^^FPTrj:D:r!'K[XL/\HT3LENv^HDr 
tTNVTVCAEEAKVNFTLGPVI f IKKDRENHPKTR IG^VEY'VAKTHEMITGPKAIALPIYA
rPLLADKFKDQLLSLLCYDSLEYRLRYDrRL.LRDASFSFSAYLVTPODLDNGSLrYPNYC 
Y3 PTKCLMQWCMLS P KQA FtVKS EQVEDF LN ERG HLIQEFG FQT FINERP EGH LTYNVT 
EC^3VLI,FHYDVGDPSSTEIRF0TWT^YTNQ^ 

DAALRRLPNFFSS PPNLKDLL I EVHRQSRGKCLDLK P I LVGLGESRCWLPGVFLYREDIG 
FSLIPTPLQGLCFLPRVIPPENVPQFLTQYAQHERrLFPNPQTRPPESYELVIQSIHRPH 
PASPLHLQLELKTNLGSVPrGIALCGLKSKHTFLFTOAGFLDLKQNLFQFLKQFLSTQKC 
VT AENT\^IANITDfVFKLDALAPL3VTDrT r ANPEDLQFFSQLKAACLPPI PQNLFSSDHQ 

!,,,, ;\ • n.ii'/Mwr r.^iHi'i ,i :i. f r'i':vrv.L;:u\. v<fl r vti 

r •■. ■ n:...*N!i'. t ';v ' ' M :.f mv:t^ rrr.Pwr.KrvMA^rv 

VFDEIHMAKNKGSQ IHKILCRIDAyMKUJLTGTPI ENNLLEFKGLLDI I LPNYLPSDALF 
KKLFTK RCS3 EELEE 1 1 PSQDLLLKLTR P F I LRRTKKLVLP ELPDKVES I IACSLSPDQE 
KLYHATL^REKSHIQKLETPEEPATNFLHIFAlIJffllJCOICDHPAVFFKDPDQYKNYESG 
KWNAFVKLLKESLNAG YKVWFSQ Y IHMIRI ITLYLEE IG IKY AS IQGKSLNRKEEI ETF 
TT D PNC Q VFVG S LLAAGTG I NLT AGNW I MYD RWWN P AK ENQALDRVH R IGQKNTVF IYK 
LITEOTLEERIHYLIEKKrRLLDKVIASQDSNILHMLWREDLLTrLSYKDEHGTSDSEES 
PVDAPVEDDTGVLPPEDS 

CPn_0836 946960 945722 

brnQ-Amino Acid (Branched) Transport

KMKKNASHKTNDKKSLSIWS IGGS I F AMFFGAGN IVF PLALG YHYNAH PWS X Y FGMMLT A 

VCVPLLGLVSMLFYSGDYQKFFFS IGRI PGMI FITAI ILLIG PFGG I PRAI AVSHATLI S 

LSEHKSAFIPSLPIFSAICCVLIYIFSCKLSRLICWLGSVFFPIMLVTLLWVIIRSFMIP 

THPMVQEF I PNARQAWLAG F I EGFNTMDLLAAFF FCS IVLISLRQLVAEEKHPTEEEIPL 

SFG^ISKKNKRSLALGFFLAAILLGhTTTI^FVLSAARHAGIXVW 

PNS I LAGVSVF I ACLTTE I ALVG IVADFLARWSFKKLNYASAVICTLIPTYL ISILNFE 

T I SHLLLPLLQLSYPAL I VLACGNI AYKLWNFRYSPVLFYLTLSLT I VLKLVN 

CPn_0837 947777 947145 

nth-Endonuclease III

LTMKQFILRTLNALFPNPKPSLEGWSSPFQLLIAILLSGNSTDKAVNSVTPQLFAKAPDA 
QSILDLPPGKLYQLrAPCGI03ERKSAYIYQLSQIL\mDFHGEPPNDMALLTQLPGVG 
ASVFLG I AYGKPTF PVDTH ILRItAQRWKI SEKKS PS AAE KDLARFFGHENT PKLH LQ L I Y 
YARQYC PALHHKI DNCP ICSYLAKEANSTRT 

CPn_0838 949196 947781

thdF-Thiophene/Furan Oxidation Protein 

ISLNIYPNSFHLFNLKLGILSESSFNFS I FMLKHET IAAI ATPPGEGS I AWRLSGPQAI 
VIADRIFSGSVASFASHTIHLGQVIFEETLIDQAT.r J,[ J^SPRSFTGEDWEFQCHGGFF 
ACSQILDALIALGARPALPGEFSQRAFLM3KIDLVQA£AIQ^IVAENIDAFRIAQTHFQ 
GNFSKK IQEI HTL I IEALAFLEVLADFPEEEQ PDLLVPQEKI QNALH IVEDF I SSFDEGQ 
RI^CGTSLILAGKPWGKSSLLNALLOKNRAIVTHI PGTTRDILEECV/LLQGKRIRLLDT 
AGQRTTDND I EKEG I ERALSAMEEADG ILWVI DATQPLEDLPK ILFTKPSFLLWNKADLT 
PPPFt^SLPOFAISAKTGEGLTQVKQALIQWMQKOEAGKTSKVFLVSSRHHMILQE^A^ 
CLKEAQONLYLQPPEI IALELREALHSIGMLSGKEVTESILGEIFSKFCIGK 

CPn_0839 949230 950159 

psdD-Phosphatidylserine Decarboxylase

FLFIVSRGLVQKPQYIDRITKKKVIEPIFYEKT^FLYNSKLGKKLSVFLSTHPIFSRIY 
GWLQRCSWTRRQ IRPFMNRYKI SEKELTKPVADFTSFNDFFTRKLKPEARP IVGGKEVF I 
T PVDGRYLVYPNVS EFDKF IVKS ECAFS LPKLLGDHELTKLYAHGS I VFARLAPFDYHRFH 
FPCDCLPQKTRCVNGALFSVH PLAVKDNF ILFCENKRTVTVLETEQFGNVLYLEVGAMNV 
GSIVC/TFSPNC/I^AKGDEKGFFAFGGSWILLFLPNAIRFDITOLLKNSRMGFETRCLMG^ 
SLGRSQREEI 

CPn_0840 950141 951544

CT700 hypothetical protein 

ISERRNIJCTLKTFFGIAKRDKSQKV^IMV^VILWALAASIAIALVAKGYYRFVYFRRYAV 
QVIREVIU*SM£LKEWALAEQQLLPILKKRSYRRCCLFEYMRILRKMQRFEESEKLLAEAK 
KLGLRGPYFFLEIAYKAYRFGAFKECAQAFASVPQDLFEEEDAAKYASALVRLGDLDAAC 
SL I EPWI SPLSHQCTFVTMGHIYFTSKRYKDAIDFYNRANAU^CPVE\mTJI^OA 
SSYAKAGKLFRKLLSNPVYKEEALFN IGLCEQKLGRPGKALL I YQS SDLWSRGDALLMKY 
AAMAAMTORDYVI^PCWEI^RCSTFAKDYKCGIjGYGFSLCRLRKYGDAERVYCNLIQN 
F P ECLTACKALAWLCGVGYATLLGSE EGLMYAKKAVELDH SC ETLELLS ACEARCGNFDA 
AYEIQSFLSSRDTSLQEKQRRSQILRILRKKLPLNDHHIVEVDALLAA 

CPn_0841 951719 954640 

secA-Translocase SecA

I KRHMLGFLKRFFGSSQERILKKFQKLVDKVN IYDEMLT PLSDDELRNKTAELKQRYQNG 
ESLDSMLPEAYGWKNVCRRI^GTPVEVSGYHQRWDMVPYDVQILGAIAMHKGFITEMOT 
G EG KTLT AVM P L YLNALTG K PVHL VTVNDY LAQ REX! EWVGS VLRWLG LTTGVLV SGTL L E 
KRKK I YQCDWYGTAS EFGFDYLRDNS I ATRLEEQVGRGYYFAI I DEVDS I L IDEARTPL 
1 1 SGPGEKHNPVYFELKEKVASLVYLQKELCSRIALEARRGLDSFLDVDILPKDKKVLEG 
I SEFCRSLWLV5KGMPLNRVLRRVREH PDLRAMIDKWDVYYHAEQNKEESLERLSELY 1 1 
VDEHNNDFELTDKGMQQWVEYAGGSTEEFVMMDMGH EYAL I ENDETLSPADK I NKK I AI S 
EEDTLP KARAHGLROLLRAQLLMERDVDY I VRDDQ I VI I DEHTGRPQPGRRFSEGLHQA I 
EAKEHVT IRKESOTLATVTLQNFFRLYEKLAGMTGTAITESREFKE IYNLYVLQVPTFKP 
CLRIDHriDEFYMTEREKYHAIVNEIATIHGKGNPILVGTESVEVSEKLSRILRQNRIEHT 
VLNAKMHAQEAE I IAGAGKLGAVTVATNMAGRGTDIKLDNEAVIVGGLHVIGTTRHQSRR 
IDRQLPGRCARIjGDPGAAKFFLSFEDRLMRLFASPKLiNTLIRHFRPPEGEAMSDPMFNRL 
I ETAQKRVEGRNYTI RKHTLEYDDVMNKORQAI¥AFRHDVLHAESVFDLAKE ILCHVSLM 
VASLVMGDRQFKGWTLPNLEEWITSSFP I ALN I E ELRQL KDT DS I AEK I AAELIQEFQVR 
FDHMVEGLS KAGGEELDASAICRDWRSVMVMH I DEQWR IHLVDMDLLRSEVGLRTVGQK 
DPLLEFKHESFLLFESLIRDIRITIARHLFRLELTVEPNPRVNNVI PTVATSFHNNVNYG 
PL.ELT*/*/TOSEDOD 

CPn_0842 955015 954710

CT702 hypothetical protein (frame-shift with 0843)
KYYTPPTrGRSPWSNIALKTT^'EPEYDCMQLLKTQSLLTTrA/DTLLNAPKDFPMSKNQKH 
ILFC I/\f IHTL.JHYAOFL IA(lNRRKrwiRYYND0VW3EWTPF I 

r:Pn_V"i4 - ^ ^ J? <fS5.> iO -fWftA 

CT70/ hyporhGf u al piorem (tcimf* nhitt with 084 3) 
fJKMKLI r;WRI KVMNYrxjYfyiM % ryrDll[>^I rEOr.t.DNCEEAA.rJLDKYQCTf'WVEE 
DLL [VL/jE'IVKKNT [ HOF.C> 

'*Pti_0>44 '«', t >/M 

yplK MP.i . t ;/';TP hiiidiiKj pr',r< in 

KNr'rETIMLh[A[LGRI'NVtTK:; f :[,fNr'[/ K!^;LAtVNr;OE^TTRI>HLYf;ELHAP;VPA0V 
[ DT'X//DHN: :EDYF0KH [ YNQAI/rnAKf lADVt .1 .1 .V [ D[ W/l rTEEDAHLAKLLLPLKKPL 
tLVAriK/-Df;RyEEt^[IIETYKlA;ri'rjI-/*/T-;TA[IDKI|[l/rr.ryjRCKLVANLPEPREEEEE 
OLEEL.;VDRI»iECEAALPr;rrrFPDF:;cVFTECF"PEEPCTrPESPQQAPKTLKIALIGRP
NVGK:;:; r ENGLLNEERC e idntpottpdn i d e LY3HKDRQYLF IDTAGLPKMKSVKNS IE 
WISSJRTEKA L^RADECLLVIDATQKL33YCKRiL3LI3KRKKPHIILINKWDLLEEVP^I 
EHYCKDLRATDPYLOQAKMLC I3ATTKRNLKK I F3AIDELHHWSNKVPTP rVNKTLASA 
LHRNHPCVtQGRRLR IYYAIQKTTTPL£FLLFINAKSLLTKHYEYYLKOTLKSSFNLYGI 
PFDLEFKEKPKRHN 
.:amny-. r:-fR.'" 1 i , [,A[ J f " ' t'" . ; . v-v ' :^na> ;yqa ^f ^jW^pmlmn

RPLED I D E ATNAS PT I VST IFPDVI3 ICV AFG I E '/VKQOGRLFEVATFRSDGEYKDGRH P 
DRIIFSSMREDALRRDFTWO , T^DPFEDKVFDFVEGTRDIEKKVIRAIGHPRLRFSEDK 
LRI LiRA I RF S S S LGFT LDPTTERA I 1 KEAPALVNSVS P ERIWQELKKMLKRQ PYGALSLL 
LKLKVL I FX FPELRDI PYSLI-RTT IEFARKFNPTHFPEILFLLPLFQGVSEEAATVAFGR 
LRI SNKELKLI ESWYEALPHFQNQSGNRVFWAHFLAS PTAPLFLELFSALQKDPSRQQHF 
I SRVQELESRLEQF ILR IKTSSPWSAPDL IAXG I SPGRLLGDLLREAEILS I ENECLDK 
EKILLLLQEKGFWK 

CPn_0846 959383 958112 

clpX-CLP Protease ATPase 

REHMNKKNLTICSFCCRSEKDVEKL I AGPSVY ICDYC IKLCSGILDKKPSST I SSAPVSE 
T PSQPS DLRVLT PKE I KKH I DEYVIGQERAKKT IAVAVYNHYKR I RALLHNKQVSYGKSN 
VLLLG PTGSGKTLIAKTLAKI LDVPFTI ADATTLTEAG WGEDVENIVLRLLQAADYDVA 
RAERG 1 1 Y I DEI DK IGRTTANVS ITRDVSG EGVQQ ALLK I VEGTTANVP PKGGRKH PNQE 
Y I RVNTENI LF I VGGAFVNLDKI IAKRLGKTTIGFS DDQADLSQKTRDHLLAKVETEDLI 
AFGMI PEFVGRFNCIVNCEELSLDELVAI LTE PTNAIVKQYMELFAEENVKLVFKKEALY 
AI AKKAKQAKTGARALGMI LENLLRDLMFE I PSDPTVEAIHIQEDT IAENKAPI I IRRTP 
EAIA 

CPn_0847 960019 959387 

clpP-CLP Protease Subunit 

KLFDEETQMTLVPYWE^y^GRGERA^lDIYSPiIJa3RIVMIGQEITEPLA^^^VIAQLLFI^ 
SEDPKKDIQ IFINSPGGYITAGLAIYDT I RFLGCDVNTYC IGQAASMGALLLSAGTKGKR 
HALPHSRMMIHQPSGGIIGTSADIQLQAA£ILTLJ<KHLANILSECTGQPVEKriEDSERD 
FFMGAEEAI S YGLIDKWTSAKETNKETSST 

CPn_0848 961556 960177 

tig/murl-Trigger Factor-peptidyl-prolyl isomerase

VQAS SPAFPFKSNKKGCLVPRSLSNEQ FSVDLEESPGC I VSALVKVS PEVLNKLNKQALK 

KI^IEITLPGFPJCGKAPDDVIASRYPTNVRiCEIjGEILVTQDAYHALSTVGDRRPLSPKAW 

SN^CQFDLQEGAKVEFSYEAFPAISDLPWENLSLPQEEAASEISDSDIEKGLTNIGMFF 

ATjf^^ERPSQEGDFISISIJ^SKS^ENASSAAIFENKYFKI^EEEOTDAFKEKFIjGIS 

TG HRVVET I T S P E IQS FLRGDTLTFTVNAVI EVS I PE IDDEKARQLQAESLDDLKAKLRI 

QLE^AKDKQIvQKRFSEAEDALAMLVDFELPTSLLEERISLITRSKXJjNARLIQYCSDEE 

LEpXSELIKEAEEDATKAI^LFLTHKIFSDE^TISREEI^YMMDVCSRERFG<MPPK 

DI $N|)TI^ELVMSARDRLTYSKAI EHVLRKAELLAST PSA 

CPn_0849 961752 965285

mo^l?snf-SWF/SNF family helicase 
ADYTiHSYSRGEMLhTFRKLRRDFSANILQEXSKKLFECGAVIDAKIL 
GL^IYECEIEVDRSESDTVDSNCDCSYKYI)CQHIVALLFYLE&^ 
ET&HE I NEEVKKELKET FVAAATKEEERKDREHQKE ILtREYVHAANALSANPFFLPLEYL 
E KS? SAELAVLFVSVNEDTF APANQP I EFQ LVLRL PCRSKPFYISNI RT FLEGVLYQ E P I V 
LNSRRFFPTMQS FNAS DRKLIDLL I RYVRYPNHTTEEKLLKS AYLMPPALGVILAKMFEH 
QLADRGGG SI/3EKESF SGLFCGNLEE PLCWSLTPAKMKFNLD FFDMPYKALLKrPVI LVD 
DDEVQPEQTMLLESDAPGI IHHFVYHRFSPQIKRAHLRSFSRLRDI AI PEALFGSFRENA 
L P^-5fEYAE I ANVHLLNSFVTL PYVDEVRAICDMSYLDG ELEAKLHFLYG SLRVPAASLA 
L^YQDVRAFISDEGILARNLVEEPJ^MLEEWSGFIYDE^IDGAFRVKSEKKIV^ 
ANC^SlTraCPENLSGQFIYTlETIFELSFREGSDINYYEADLiO/HGLLKGVPLDLLWCCI 
SA^FLELPKAGC^SKGTRRGKVNSGKLPCILVLDLEKIAPVVQIFNEIGFKVLDDLVQ 
KC PJiWSLTG I SLDQFEALPVNFSMSERLI EIQKQ IRGEI EFDFQDVPQQ IQATLRSYQTE 
GVKWL £ R LR KMH LNG I LADDMG LG KT LOA 1 1 A VTQS KL EKGSGC S L IVC PTS LVYNWKEE 
FRKF^PEFRTLVIDGVPSQRRKQLTAI^RDVAITSYNLLQKDVELYKSFRFDYWLDEA 
. HHI^RTTRNAKSVKMIQSDHRLILTGTPIENSLEELWSLFDFLMPGLLSSYDRFVGKYI 
RTG^Y^GNKADNMVALKKKVSPFILRRMKEDVLJCDLPPVSEILYHCHLTESQKELYQSYA 
AS AJ^ELSRLVKQEGFER IHIHVLATLTRLKQICCH PAI FAKDAPEPGDSAKYDMLMDLL 
SSLWSGHKTWFSQYTKMLGI IKKDLESRGI PFVYLDGSTKNRLDLVNQFNEDPSLLVF 
LIS LKAGGTGLNL VGADTV I HY DMWWN P A VENQ AT DRVH R I GQS R SVS S YKL VT LNT I EE 
KILTLQNRKKSLVKKVINSDDEWSKLTWEEVLELLQI 

CPn_0850 965254 966390 

mreB-Rod Shape Protein-Sugar Kinase

LGKKYWNCCRYDFMS PHRNLFKLKNFSNRLYNRALGRFDKVFNFFSGNVG IDLGTANTLV 
YVRGRG IVLSEPS WAVDAQTHAVLAVGHKAKAMLGKTPRK IMAVR PMKDGV IADFEIAE 
GMLKAL IKRVTPSRSVFRPRILIAVPSG ITGVEKRAVEDSALHAGAQEVIL I EEPMAAAI 
GVDLPVHEPAAGMIIDIGGGTTEIAI ISLGGIVESRSLRIAGDEFDECI INYMRRTYNLM 
IGPRTAEEIKITIGSAYPLGDQELEMEVRGRDQVAGLPITKRINSVEIRECLAEPIQQII 
ECVRLTLEKCPPELSADLVERG^f/I^GGGAl J IKGLDKALSK^^^GLSVITAPHPLLAVCEvG 
TGKALEHLDQFKKRKGNLV 

CPn_0851 966378 968195

pckA-Phosphoenolpyruvate Carboxykinase

REFGIVMVWSTNIKHEGLKSWIDEVAKLTTPKDIRLCDGSDTEYDELCTLMESTGTMIRL 
NPEFHPNCFLVRSSADDVARVEQFTF ICTSTEAEAG PTNNWRDPOEMRRELHQLFRGCMQ 
ORTLYIVPFCMGPLDSPFSIVGVELTDSPYV^/CSMKIMTRMGDDVLRSLGTSGKFLKCLH 
:;VGKPL3 PGEADVSWPCNPKSMRI VH FQDDSSVMS FGSGYGGNALLGKKCVALRLASYMA 
K3QCWLAEHMLI IGITNPEGKKKYFSASFPSACGKTNLAMLMPKLPGWKIECIGDDIAWI 
RPGRDGRLYAVTfPEYGFFGVAPGTSERTNPMALATCRSNSIFTNVALTACGDVWWEGLTE 
OPPEPLTDWLCKPWKPGGSPAAHPMSRFTAPLRQCP3LDPEWNSPQGVPLDAIIFGGRRS 
KT I r*LVYFAL:;WKHGVTIGACMS3TTTAA I VGQLGKLPHDPFAMLPFCCYNMAYYFQHWL 
;;t-'AENR^LKLPKIFGV^FRKNNCOErLWP*'JF£;ENLPVLEWEFORTDGLEDEAERTPIGY 
lA'tl . r 0KFMr^je;r.NLDLOTVOELFnVDAE(]WLAEVEN ['"JCYLK E FG3DCP0Q ITDELLRIK 
::i-:i,kf.k 

i "l-ti_()H r >:; 'i^fi274 ')/,) (j i3 

i ."[ , 7LL hypor.h(-t U\il proti.-m 

I K I .U I r [jYYYI. rNTVTL^PUYrNFTPNVTTALSGGK ILTGA [ EL:;(*3ALFFQCLQDKAQG 
I.KHAUJEA/ijf^lAKALRrAOVQTG [ GYLPTEEo^Rr^ E GAiU E DRTMPTFTDDEVKAILQ 

ni-mfet:;k r i-vet JLOKvrK:;YLD.svTPrEfj edp^npela l r lny etllnnlkpkfaagst 

ITUADYf JALYAU* iDFVKE [ EALKAA DA P P K 3 K V H A FV/Q R I MT E YNNMQVLSYPVTDYLN 

vy rAijt.::r i N r taao evoo , t r l kn f y e l k [) e li j iwt Lt^jAvny p a daey n a p dag v e OS l 



lnlscnyrqltenmlpotct^lpoe: iao ersfop* ;vngt e ea^ntllpt^mrldtllgv 
eytycccati fgmsygtstpako^ r da inqeksywoar angfdvt.^dovfdqfatniqs 

GT3YRGIDLFKNNKVNEINP I FLJQAASFLRY PYNLM5RSMYCT E EDAANRS ITALDGLI 
SGWSTQIATFGTQKNSLDPjLLKYFCTMK^ 

I AS LG I QWTY SNKAAKYLMEL IKE I TTFQS AD I YY S L3 1 Y LKOMNLQAVAD P IGKAVGVL 
NDEKTRAMAD ITRCNK I KAA I DKMLVEI KADAELSKSQ I RELVDTLTNFKSQSDDL IRNL 
SC LLGFLSGLTLKAVNDPNATYEAFTAE I FTEPFNNWK RQLATFE3FV I QGGQNGITPGG 
0OOLLOAMESS0ODFSTFT^NCCLALCLESSA>KX)EOTLVSAALALL^ 

CT712 hypothetical protein 

NIMHPKI EKRNSLPLTAVAPVFEESYH PSVATrVDYVDATTLSRHLTVLKDVI KEARNLD 
LGKAFLTSMKQGFINTGTELA I IQAS LADQSS RESRKKEEK I FHQHLGKAAPQAATATSG 
VQ PTADPVADKMPLQSAFAYVLLDKY I PAQEEALYALGRELNLSGYAQNLFS PLLDMIKS 
FNSAPrNYNLGSYISQTSGTANFAYGYEMI LSRYNNEVSQCRLD I ASTVKAXAALANMSA 
SVKANVS LT DAQ KKQIEDIIASYTKS LDV I HTQLT DVMTNLAS I TFVPG LNK YD PSYR I V 
GGDLS I IALONDEKVLVDGKVD ITTAVNEGGLI^FTTVLTDVQWGDLAG^C^LMLDLE 
IJCAMOX^SLVSASLKUJ^GMYTTVISGFKN 

CPn_0854 972849 971806 

ompB-Outer Membrane Protein B 

GPFI^INSKMLKHLRIjATE^FSMFFCIVSSPAVYALGAGNPA^ 

CNS YDLFAALAG SLKFGF"/GDYVFS E S AH ITNVPVI TSVTTSGTGTT PT I TSTTKNVDF D 
LWNSS I SSSCVFATIALQETSPAAI PLLDIAFTARVGGLKQYYRLPLNAYRDFTSNPLNA 
ESEVTEGLIEVQSDYGIVW3LS[^KVLWKE>3VSFVGVSADYRHGSSPINYI IVYNKANPE 
IYFDATIXjNLSYKEWSASIGISTYLNDYVLPYASVSIGOTSRKAPSDSFTELEKQFTNFK 
FKIRK I TNFDRVNFCFGTTCC I SNNFYYSVEGRWGYQRA IN ITSGLQF 

CPn_0855 974001 972994 

gpdA-Glycerol-3-P Dehydrogenase 

GLMKQH IGYI>GMG IVIGFCIJ^SIXANKGYPVVAWS RNPDL I KQLQEERRH PLA PNWI SPN 
I^FTTDMKEAIHNAFMI^^EGVTSAGIRWAEQIJCOITDLSVPFVITSKGIEQ^^rc 
IMLEVLGDSVTPYLGYLSGPS I AKEVLNGSPCSVWSAYDSQTLKQ IHEAFSLPTFRVYP 
OTDIKGAALOSAIJCNVIAIACGIAEGLSFGNNAKAGLVTRGLHEMRKLA^ 
GLAGI^DLCVTC FS ES S RNLRFGH LLAQG LT FEQAKAK I GMV.'EG A YTAL S A YQ VAKH H K 
I DMP ITTG I YRVL YENLDLKEG I A LLLQRNTKEEFL 

CPn_0856 975410 973995 

AgX-1 Homolog-UDP-Glucose Pyrophosphorylase 

GSRDRNVRLTVMTESWSPSAMHVNSIjADKLKAINQEHILDIWPSLSPKO^QRLFQQLTS 

VDIDFFPJCO^^LLSSPTAIIJ<DFHPITSFASSGEDPERAHAGTTIiKEKKVA 

GSRIJCCDGPKGLFPVSPIKKKPLFQLVAEKVRAASKI^GOPLPIJ^Fm'SPIi^ 

ESNDYFHLDPNQVDFFCQPLWLLTLSGDLFLEI»IErrLALGPNGI^ 

WKNAGIEMVSVI PIDNPLALPFDVELCGFHAMSNNEVT I KAALRQTAI EDVG ILVKSHDS 

GKTSVIEYSEIPQNERFALNEIX3KLKYCLANIGLYCLS 

QLGHTSLNEKNAWKFEEF I FDLFCYS DHCQTLVYPRQEC FAPLKNLEGNHSPDTVRQALS 
DRERQLFHKVTGKKLSPNTTFELEADFYYPSTSTSLHWENKAFFEEPFFEAS 

CPn_0857 975808 975392 

CT716 hypothetical protein 

LLLLRQYI KTARG I SRLMRDRLGS LS L I LKVK I HKYLDTLHNOKRLALTVSRNIQATNKR 
I ADLHLERYEHF I SRDNI KHYD I LXjEYLKTLiQS SLYKQQS ESLRFLE I H HQQLQEL INRR 
K I I EKIKNNKYSKDQ EIGT 

CPn_0858 977115 975757 

fliI-Flagellum-specific ATP Synthase 

RNS ETRNQ RRT RP S T FCFD SMNHLNK EKL H I HNWQ P YRACG L LS KVSGNL I EVDG L SAC L 
GEIXKISSTKDPNLI^EVIGFH^mTLJ^SLSPU^SVALGTEVLPLRRPPSLHLSDHI^ 
RVLDAFGNP IDKKEDLPKTHRKPLLSLPPSPMMRQPIDQ IFPTGIKAI DAFLTLGKGORI 
GVFSEPGSGKSSLiSAIALGSKSTINVIALIGERGREVREYIEKHSNALKOQRTIIIAAP 
AH ETAPT KV I AG RAAMT I AEYF REGGH EVL F I MDSLS RW I AALQEVALARGET L S AHQY A 
ASVFHHVSEFTERAGNNDKGS ITALYAILYYPKHPD I FTDYLKSLLDGHFFLTSOGKALA 
SPPIDILSSLSRSAQAIALPHHYAAAERLRSLLKVYNEALDIIHLGAYTPGQDEELDKAV 
KLLPSIKAFLAQPLSSYCYLDNTLKQLEALADS 

CPn_0859 977597 977055 

CT718 hypothetical protein 

VFLVTTPQSPGSI^QSHLPHPHDPWDTEPTSLPEDPNDKASQELHSLVHLFRKLSIHLLS 
EVEKTVQQLKPDLLELALLICEKFLYKKLENPQELALLLSTALQRHTTLRSLTPIKVFLH 
PEDLKTLTDWISTHELPttlKHAEFFPDTSCRRSGFKIETPNGILRQEISEELDHLLSVLT 
A 

CPn_0860 J ^S639 977608 

fliF-Flagellar M-Ring Protein 

RTLVFFQNLAKKLTALG I $ FLGCLL IGGWSCA ILFGRSSNPSLAPTQVKTEKTSGNWLK 
LTOMGNPKLIESLTKKECLEKDLTSFHPIASAKVAIALSTEDDVMSPLHLSVILTLRKEE 
SLTPSLLFSITDYLCSSLPGLKREHISLSDNLGNLYIPEG ITVNSLFIHTLENYLGKIFP 
KEHFALAYHAKAEKPTLCLTLNENYIAHLTKEESEKIVAHTKHYLYONYDDSYDIVIETL 
PFARLQNKKSPPAKVLIGJMILVISLMIVALASr^LARHAYERVSPEPRKIKRGINISKL 
LEIIQKESPEKIALILSYL?PKKAEALLNRLPEDLKHOVLKYKL 

CPn_0861 -1-^752 978925 

nifU-NifU-related protein 

AS Y P FTWKF E.MTL PLE PM I FWS 3 LS MC/MK K F LT PHC ACT FS EEDA EAK EAHLVTG KOGH 
PLMGNCVTFYWLVDKKNGV I LDAKFQYFGHPYLI PLAEAVCNLVCGKSYSEAYKMTLDDI 
DKSLRVHAHOPALPEDSE^LYHFVIDALDTAVEQCLEIPLEDGSLPLONGPMNLDFEDAN 
PYSQSDWEALTHEOKLYALRATIAEKIGPYIAMDGGEVTVESLENFIVTIAYSGNCSGCP 
3SLG3TLNS EGQLLRAY I\ PELQVKVDE3GLNL3HP 

CPn_0862 -^O J24 j7<)122 

yfhO NifS-related protein 

GROTIFPITDCKT^rrSMFKE'QrmKAPPEFWLmiOVAEPPGERVKC^YALHL'DEi-GLPPC 
"ALKEj/'.EKTEES IROLVMLKDwII IFRF'/PHFI'ir/VIl EVI.AALVENLl'MFC*'RNM [ ELPAH 
DO0LLEN3LCRH0CU;TT\[WrVNHn^P [VEF^LEETL:;PR:U J Lr:;[^AAHCE J T(lVEOP 
LDPLL3LCKDRR ILLMLD! ."[J ELGRAPLTPE [ LI IAD [ ITF33AALG<1MG: j ICG I P ERK3L 
ERVFSSWFPPHTSA'ILa'F-' WAAMCW HER [■:ALP[.FTFHTi:NU'KKL EOELO'IVLP:! I 
0LAF:;EV0NRLPNEWA-\I PDrPAE:;[,AFHDK^ CYP.';L/IYERrorLAOVLONt a e:;pf 
LCH.':ALHFf.LTERfIKULL'" :KI,AHAMlil,A I KMLTPL.U;:::;S 
EH MALL E LLRHGQSVWNEKNLF3GWVDI PL3QQC I EEAFSAGRAIQNLP IDC IFTSTLVR 
3LMTALLAMTNHHSKK I PY I VHEDPKAKEMSR I Y3 AEEENNMI PLYCS3ALNERMYGELQ 
OKNKKOTAEQFOEERVKLWRRGYKTAPPQGESLYDTKQRTLPYFEPCNILPQLQNGICJVFV 
SAHGNSLRSLIMDLEKLSEEEVLSLELPTCKPWYQWKNHKIEKHPEFFG 

CPn_0864 981658 982374 

yjbC-predicted pseudouridine synthase 

YGVNVT KVR LNK F LAS AGVAS R R KCD E I E F SG SVTVNGR VAEG PFVL'/D PEDKVQVGGT S 

vfiLTK"vvi-Mv;i:' -f.Ki i e ■tvl'. : ; / r -yaltw^. rTVGPLrw'LT ; ;l:lvtn 

!v:rfanki rut'::.-,rT^- vll^vtit v akpi/ > : MEvTrr rp^nivr-- - rrp'TTVK 

IWSEGKKHEIRLFADAAGFPILELKRIRIGSLVLGGI^YGEYRELTDAELGTYMKLSD 

CPn_0865 982412 982942 

CT865 hypothetical protein 

SPMGYVFWIAGSIFLCISI^AYCQLYYSWSVLFSWYLLTVYALEKRHAL^ 
EDAQSQKEIDFLSQCDKLSWRAFLKNSYEIIPTFKEMEDLLSERVQGFLESIETIAEHDR 
A I LC I ENFWASKNL FD FE I AAY EEA VEKY LKL RQRA PLRLAS KLF R FLD VPS I RFS S 

CPn_0866 983494 982916 

birA-Biotin Synthetase 

NMKVIYYE I EEI PSTNTMAKSY1WLWDPYALWISTKCQTAGTGKFGKSWKSSKGDLLNT 
FCFF ITDLH IDVSRLFRLGTEAWALCKDLG I TEAK I KWPNDVLVHGEKLCGVLPETLPV 
EG LLGWLG IGLNGNTTKQALKDVGQ PATSLQ E I LGHP I DLETTRELL I HHLLGVLQ ENL 
PDSLATKSNRGNI 

CPn_0867 983405 984667 

rodA-Rod Shape Protein 

C I RI PQMHIGFCHCVRGGNFFYFVINNFHILE I YSLLNSNTIMRYHfCYFRYVNSWVFt.W 
LT LMLL SVWI S S MDPT AMLVT S S KG LLTNKS I MQ L RH F ALGWVVF F I CAYF DYHLFKRW 
AWVLYF FMI CALVGLFFVPSVQNVHRWYRI P F I HMSVQ PS EYGKLVIVIMLSY ILESRKA 
D ITSKTTAFLACLWALPF FLI LKE PDLGTALVLC PVTLT I FYLSNVHSLLVKFCTWAT 
IGIIGSUuIFSGIVSHQKVKPYALKVIKEYQYEJRLSPSNHHQRASLISIGLGGIRGRGWK 
TG E F AGRGWLP YGYT DSVF S ALGE EFGLLG LL FT LG LFYCL I C FGC RTVAVATDDFGKLL 
AAGITVYLAMHVLINISMMCGLLPITGVPLI 
Y 

CPn_0868 986733 984670 

zntA/cadA-Metal Transport P-type ATPase 

NFRNGLGVRDLHHFREYYLI INEI I ITGRYVFSRLFFTSFSAEWNTFFESGMSEDTSPL 
LSKQNRKLSHNLPLKSAYX*SLGTYLIAI^SFWIJiAKNI^^ 

NICQKWNIDrLMTSAAFGS I FIGGAI^AIXLVLFAISEALGQMVSGKAKSTLVSLKQL 
AP^GWLVLEDGNI^KVAINKIEVGNILRIKSGEVVPIJ^ 

KSGtfPGS IV PAG AHNMEG S FDLRVLRTGS D ST I AH I INLVI QAQNS KPRLQQRLDKYSSV 
YAiiSJ FAI ACG I ALLVPLFTS I PLLG PQS AFYRALAFL I AAS PCAL I IAIPIAYLSAINA 
CAl&iGVI^KGGVIIJmVSCNSWMDKT^^ 

SSSHP I AEA I VS YLMEQ KVS S L PADRYLTVPG EGVT?GYFNEQEAFVGRVErrGLGKVPSEY 
LEETEQK I YQAKQHGE ICSLAYVGNS FALFYF RD I PRPQ AXE I IQDLKDLGYPVSMLTGD 
HK^^ENTAE ILG I SEVFFDLTPEDKLAK IRELATQRQIMMVGDG I NDAPALAQATVGI A 
MG'EPtGSATAIEAADIVT^HDSLSSLPWIIQKAKQTKKWSQNIAI^ 
I Ipt|VLAVILHEGSWIVGLNALRLLKS 

CPn_0869 987479 986658 

CTJ^B hypothetical protein 

EGWRFFFPKTS ENTSDCRQHQ ILRK I MTQDPHDHFKSRTPEDH I KHVRDKHRVCKG EPHT 

TFjK©FFYHLANNALSTGWIFFIRTLFFLIPTNRA 

WAY^LSHRSMLEEKNEIEENFEQEXIELRILFF^C^FKDPLLQEMV^ 

M I REELY I RKEDL. PH P L IQGGSR ILGGLCGLA I FLP LVLC I S YTLAGVFS ALMVLVLSFL 

KARl LKNDK I S EMVWVLG I F I TS AS I I S S LMKLL 

CPn_0870 988881 987446 

serS-Seryl tRNA Synthetase-2 

TTXHPTQGFGGAVILPFSPISIARRIKKSCCSEKSS lYSHFCTLLLNNETSMLDIKIIRK 
TP^ETRLRKKDPKXSLEPVLSIJDKEVRQLKTDSETLQAQRRLLSQDIHKAKTQGVDAT 
NL iQEVETLAADLEK I EQHLDQ KNAQLHELLS HL PNYPADDI PVSEDKAGNQVI KSVGDL 
PIP^PPKHHLELNQELDILDFQAAAKTTGSGWPAYKNRGVLLEWALLTYMLQKQAAHGF 
QLWLpPLLVKKEILFGSGQIPKFDGQYYRVEDGEQYLYLIPTAEWLNGFRSQDILTEKE 
LPIi^AACTPCFRR EAG AAGAQERGLVRVHQF HKVEMFAFTT PNQDD I AYEKMLS I VEEM 
LTEJ^CLPYRLSLLSTGDMSFTASKTI DAEVWL PGQKAFYEVS S I SQCTDFQSRRSGTRYK ' 
DSOGitLQFVHTLNGSGLATPRLLVAI LENNQQADGS WI PEVLRPYLGGLEI LLPKDQ 

CPn_0871 988766 989899 

ribD-Riboflavin Deaminase 

EYMEDF SEQQLFFMRRAI EIGEKGRITAP PNPWVGCVWQENRI IGEGFHAYAGGPHAEE 
I^IQNASMPISGSDVYVSLEPCSHFGSCPPCANLLIKHKVSRVFVALV'DPDPKVAGQGIA 
MLRQAG IQVYVG IGESEAQASLQPYLYQRTHNFPWTILKSAASVDGQVADSQGKSQWITC 
PEAP. HDVG KL RA ESQ A I LVGS RTVLS DDPWLTARQ PQGMLY PKQ PL RWLDS RG S V P PT S 
KVFDKTSPTLYVTTERCPE^IKVLDSLDVPVLLTESTPSGVDLHKWEYLAQKKILQVL 
VEGGTT LHT S LLKERFVNS LVLY SG PM I LG DQKR P LVGVLGNLL £S AS PLTL K S SQ I LGN 
3 LKWWE I S PQVFEPI RN 

CPn_0872 989903 991216 

ribA&ribB-GTP Cyclohydratase & DHBP Synthase 

KEP. I FRVACLAS ESVNARESMI ETREEVGS ANFVSLERAI EDLRAGKFVIWDEASREDE 
GDLI IAGEK ITVEKMTFLLQHTTGWCAALSQERLLSLDLPPMVKDNRCRFKTPFTVSVD 
AAHGVTTGVSAADRTKWQLLADPK3K P EDF IS PGH FF PLAS SPGGVLKRAGHTESTVDL 
MELAGLQPCGVLAELVNEDYSMMRLPQILEFARKHN IAVIPVTSI IAHRMLSDRLVSKIS 
SAPLPT I YGDFT I HVYESLLEGMQHLALVKG^fVAGK3^r/LVRV^^3ECVTGDILGSKRCDC 
GEQL33AMSY I AEKGTCVLVYLRGQEGRG IGLGHKVPAYALQDNGYDTVDANLAMGFPVD 
SREYGIGAQILVDLKLTTIKLITHNP0KYFGLQGFGL3ITERVPLPVRISEDNEQYLRTK 
OEPMOHWLDLPCCNNRVQ 

':Pn_0B7 j 'Mll^K , -)'Hb , J4 

ribE-Ribityllumazine Synthase 

L-J M A VT I O Y NN F E E YM KTI - Kf ;} i UlAK N LR I A r VG3C Ft 1Q AMADAL V :\1 ?Q CT FL K FOG 3 C 

\j ; lmti r v Pi ;af e i pot r k k ll: : « : ekk rDA i vai-gvl iog lt di \ y no i vnovaag [ c :al^ 
L£i :r :u> trw. I vaamac iaworco t KCRHLGVSGMTTA I EMATLPTO I 

(;i-ri_(JH74 ■ i ' ) UM *><) 1 74'> 

'TV'; 'j hypor ln?r km L ptort'tn 

l r :;lnlk r ltkqr dr efa: ;mlk l lk r kvlvfplallmocn'; r g y At r\v ' : ; lqtn : ;otkvk 

["GwPT/Vf r l-'yKt.R^Yrr'LLWLTESrxiAPLLT^TP [DMAY'JEKLFMKKVP NLOfACRoM EHL 
HLLIOG3RQ3YMCL3OrLP3EEa~;MTFK0F9TAilKOLLFFLN *t K3TDNTLR ILETAIVL 
RHVGC3AKA , m , FKPYFT03CF^3FYAKALFn/LRTFrEL^r'Vr^RLJFE0QETv'LLGLRRL 
GNYDSLLNLTET/PSAQLLGAWRTRRSLA I LDLYLYC LDTCGDK:4C30EFY INFAPLLSML 
QQHATI EEAF3P YFTYRANRLGFECT3RTDMTLVP LATLMNL3 P3EASTLAWSFKNLPSD 
EAENLVNSFYTVQGEH I PLTFRGLPSLVAGLSVATHGSTV5 PENRLRQLYSTMLSLLVKS 
LRSHREMLNKQLLPQGTVLDFSETTLSSGGLDVFAE5IAVR IHLNGAV3 INL 

CPn_0R7^ 003^3 '1^4022 

RLJRRSRRLFAPPDOTUKLTLgVCANr KTYAEK1 JEvDERDLJfVySSAAEKiSlSLALi 
QGE IKDALYRIREVHPLALI EALAENPALIEGMKKMQGRDWIWNLFLTOLSEVFSQAWSC 
GVI SEED I AAFASTLGLDSGTVAS I VCGERWPELVD I VI T 

CPn_0876 994123 995517 

dagA-D-Alanine/Glycine Permease 
SIATGETMLYFIEQLNKLSTSFCVFPMILLLGGFLTV^^ 

DSSSKANEVSSYEAVAG ILAGNFGTGNI AGMAVALACGG PGALVWVWLAALLG A I VOYAG 
SYLGSKYRKPEGNTGEF IGGP I ACLAFGMRKKILAGFFALFT IMTAFCAGNCVQVSC rVP 
LC AEGT PGKLLVG ILLALWI PVLAGGNNR ILRFSARVI PF I AGFYC ISCG 1 1 LFQHASA 
ILPAI KLICSSAFG IKAGLAG IGGYTLSQVISTG INRAVMATDCGSGMVS I LQANTKSKN 
PWDGLVT LVPPVIVMWCS ITMLVL I VSGAYSSGAQGTLMVMSAFKNS LGS LGSV I V I L 
AMALFGYTT ILTWFACAEKSLQYM I PGRRANLWLKAIYVLI I PLGGVIDMRM IWALSDTG 
FSGMVI LNC I AL I ALLKDVLSTNRDVALL K E R EC SV AD P VRNLD A 

CPn_0877 995521 995982 

ybcL family 

RRRIMQLLS PAFAYGAP IPKKYTCQGAGISPPLTFVDVPGAACSLALrVEDPDVPKE IRS 
DGLWIHWIVYNLSTT ITNLAEGAEI FAVQGLNTSGKPVYEGPC PPDKQHRYFFTLFALDV 
VLPEEENVTRDQLYEAMEFHI IEQAELMGTYEKS 

CPn_0878 996660 995992 

SET Domain protein 

GCMSTVTTEPCSSIHISLNNDWRDSQPYSLDRASEI.L^^ 

HKSEKRRLISPLAKWLGKIJiKQDLJjCPPAPPVSVCWINAHVGYG ARDEIAPWTY IGEY 
TGIIJ^RQAIWMDENDYCFRYPMPLFTLRYFTIDSGKQGNVTRFINHSEQPNAEAIGWS 
EGLFHVI IRTVAP IYAGQEICYHYGPLYWKHRKKREEFIPEEE 

CPn_0879 997463 996645 

yycJ-metal dependent hydrolase 

YRILWKVSMCGFFPIASGSKGNSAYIXnTJSCKILIDLGVSKQVVTRELLSMNIDPEDIQA 
IFVTHEHSDHISGIKSFVKAYOTPIVCNLETARALCHLLDSHPEFKIFSTGSSFCFQDLE 
VQTFWPHDAVDPVAFIFHYT^EKLGFCTDLGWVTSWITHELYDCDYLLIESNHSPELVR 
Q S QR PDVYKKRVLSKLG H I S KQ ECGQ LLQ K 1 1 T PKLKKL YLAHL ST ECNT AELAL STVS E 
SIASITSIAPEIALAQGITSPIYFSRLEVACPR 

CPn_0880 999864 997444 

ftsK-Cell Division Protein FtsK 

PMIRERKKSRHPRLPTLPIAAKASLYXFFACFSGLSLWSFHRDQPCTONWIGLLGWSFSS 
FLLYFFGAAAFF I PLYFLWLSFLYFRRTPRPLFFYKAAAFLSLPFCSAI LLSMLSPVGTL 
PALLDTRLPKFILGNNPPVSYVGG I PFYLFYEGQSFCLKHLIGSVGTAL I FGFVMLFSVL 
YXCGGIALLKKKTFQDGVKKAFCSFFC/TCFKKLKKLINRRNYLPKP 
SOPSPRJ^VSETIILDGSISPLPQEEIPGSKKESFFLTPHPCKRFLTKFVEPQENKAKEGK 
TIAI^STPTWRESKGKERAALPKI^SLAVPENDLPQYHIXSKNREARPESLQAELERKA 
LILKQTLTSFGIDADIX3NICSGPTLAAFEVLPHSGVKVQKIKSLENDIALfCLOASSIRII 
AP I PGKAAVG I E I PT P F PQAVNFRDLL EDYQKTNRXLQ I PLLLGKKANGDNLWADLATMP 
HLI IAGTTGSGKSVCINTIVMSMIMTTLPSEIKLVI IDPKKVELTGYSQLPHMLSPVITE 
SREVYNALVWLVKEMESRYEILRYLGLRNIQAFNSRTRNKTIEASYDREIRETMPFMVGI 
IDELSDLLLSSSQDIETPIIRLAQMARAVGIHLILATQRPSREVITGLIKANFPSRISFK 
VSNKVNSQI I IDEPGAENIJ4GNGDMLVLLPSVFGTIRAQGAYICDEDINKVIQDLCSRFP 
TQYV I P S FHAFD DSDS DNSG EKDPLF AQAKTL I LOTGNASTTFLQR KLKIGY ARAAS LID 
QLEEARI IGPSEGAKPRQILIQNPLEG 

CPn_0881 1005646 1006209 

No robust homolog present in GenBank/EMBL as of 11/7/98 
NKKFAVHMPVPIDNSSRNLQEVPESLEDLEQHAEESPTHQSAESSSLQLSLASSAISSRV 
EQLSSLVLGMENSDFSSLRDVPIFSAIYESSTHTPVPTPLVGVGYINGSQSGYYDTQRES 
LH LSQLLGS RRVEWYNQGNFM EAS LLNLC PRR PRRD PS P I S LALL ELWEAF FLEH PPG S 
TFNPIFFW 

CPn_0882 1006169 1007404 

No robust homolog present in GenBank/EMBL as of 11/7/98 
NTPOVALLIQYFFGNGAFYVREALRLTPHAQNIVLVGICPSLYPEHPRSFYYRVSGDIGS 
RFDDRG FVNSGVETLPYSSGS FGI FW I S FTDPTFNFAIVNTFMRTAG I NEVS RPMTQDTE 
TSLIEMRDLSEQOEANWDSLEQEESI^IVGHWGGVSMTVTSSPNIFYRIQTLLGLPE 
TLAEAEENPTFPWST I DSLAEIMMNLVR I SDAVS I FWI FPIVDTTYNGVLLAVC IGFFG I 
NGICSTFLMLTNPRSRRDRWRNLRIb^CYRSI^SGMNLFDLSNNVRMAARRH^/TSCTVA 
LYAMVTLFGWTVAIQDALQYGFPSVRDAFYRYCLRHRYCLTQRNEDSLQTTGTRFQVTRT 
H L EDQOMVA 3 1 LNLS V FG L F FG FVGLMTT FGGL E I S PSC RWDAANN RTVG I F 

CPn_0883 1008904 1007573 

dmpP/nqrt? -Phenolhydrolase/NADH ubiquinone oxidoreductase 
LYELF I KSG I F I VMTWLSGLYF IC IASLIFCA IGVILAGV £ LLSRKLF I KVH PCKLKIND 
NEELTKTVESGQTLLV3LL3SGIPIPSPCGCKATCK0CKVRWKNADEPLETDRSTFSKR 
QLEEGWRLSCOCWQHDMSLEIEERYl^NASSWEGTVISNDNVATFrKELWAVDPNKPIP 
FKPGGYLQITVPSYKTNSSDWKOTMAPEYYSDWEHFHLFDOVTDNSOLPADSANKAYSLA 
CYPAELPTIKFNIRIATPPFINGKPNCEIPWGVCSSYVFGLKPGDK ITVSGPYGESFMKD 
DDRPLIFLrCGAGS3FGR3HrLDLLLNKHSKREIDLWYGARSLKENIYQEEYEMLERQFP 
NFHYHLVLSErLPCDr/\AGWDKDDPTKTrjFLFRAFNU*QL3RLDNPEDYLYr/CGPPLHN 

rLKLU^DYrvrRs:' r r lpdfgh 
CTMl nyptirtvr umL pi or o in 

( : dc ml ' ; r r vrc f r - r l t , ■ ; : : t. p l i * a ee e aao: : k nt r vo i 1 a vm f . a r a n , k r y f n - wf < i * eo k r r 

KAMLKPKNnLAKGPKVTAMi; I [f"rVI)LI i'F.l i'l'V 1 1 AHA' 'A iKVCVl.Kf JA I 'U [ LKPNONKfJ 

i"Pri_f)H8 r i 1 (Holm) I'JO'M i j 

ytji A rRNA Mer hy t r i an . t < t ■ < 

a:: [r/rMrrfMONCi'HFt ;vt ( Mi ". ki ■* j: :ti /";!)■ ^i. KKKEFt .iJf'ji.KAi'i.vi'^iw iapi f pcjp 
:JLRr;pNKMr.rsFFO'['Yi;f;rK:;i/;rt '.:tkl'KKg l PvriY'LLriu-vrMDCLKi/r^r^uKH 

PEtJIAYFPrKNKi "rr;rVN'|VL;i vOfU- MVn.TT':< iTl'EYI'VNI'jM'lUL-'WKfl 
n r a:; t yweekvaarg r styyetk llygap z r owl." l ps dgns asfs lr prg ffqpq itq 

AAKI I ETAKEF I NPECCETLLDLYCCACT IG IMLGPYVKNVTGVEI I PDAVASAQENI KA 
NNKEDCVEVYLEDAKAFCKRNENCKAPDVT I IDPPRCGMQSKVLKY ILRTGSPKIVYISC 
NPKTOFQECADLISGGYRIKKMOPtDQFPYSTHLENIILLEREIDP 

CPn_0886 1011288 1010908 

hctA-Histone-Like Developmental Protein 

RTLFMALKDTAKKMKDrXDSIOHD^KAEKGNKAAAORVRTDSIKLEKVAKLYRKESIKA 



CPn_0887 1011692 1014157 

CHLTR possible phosphoprotein 

MKKLYH PTLFLR PL I RLSLI FALSLTL ISGNF PQQKSFGHCCADMHSALI SGKNCEELFA 
DF I ERVLADRETLTARDWGTVWLVREYLLKC I PJCGDCDYGW ILQKIJ^ALRLPKDARKD 
LOILWHPJJ^PEQAPLRDVVIX}LFTIGCHESI^DHLLFELYTV^ 

QGDYKKA I ELAK ELVAALEKGSCS PH PE I VQ I EKT FLQKTLLALQ I KVAQEAQ ESC DALL 
T P YCLS E I AYTEAMDALVLR I ARGEVSRTNEVDSVLLSHALQHLPFAREKAI P ELEVL I D 
HGAYLESTLLYYAYFSLLELYHQNKDFASLERI^EKGDAVFVPEHPYFPEYGFFLGAYFY 
AXGKY E S AE KVF LQ 1 1 DPA VKLG AT F ARA YEYLGC I AYVQNHYEKAEEY F LRAYKSWGR E 
ESGIGLFLAYAVQKKKTACEDMLYHPKFSFTYRHLLDSLCSLSYPHGENKGSSAIQRVHR 
AVPEXSEIYSRCIYt^IKYPJ^VTYTHPIIELAYNOVRNLEKRNLEEICRDAQDPEYDKAL 
AFWGALQSGASVPRSL IESSDVDEAR IT I RCY EALY F HN PDA I AML PQAF S E ECNSWOTA 
LRLVWTLVRPKGAPNHAKYWDHLVLRPHGDSLYFFGYDLQEYLIGKEIJAIJCHLSVFAELF 
PKSSLLSLVYYLQGYS ESSALRKVGWFVKALEEFTEI SWSGEHMKTWAY IYYMVKLDLAD 
TY I SLGNFSQAVH ILEEVKEDWQVASHPKLHFLKGEIXTI^MELRWVEGLAYAYFQLHET 
AHLSNHLLEHVEKNLISPRSYRDYYGESLQRTLGLCQRFLGV 

CPn_0888 1015441 1014119 

hemG-protoporphyrinogen Oxidase 

AERRFCVTCRAIIIGAGISGLAAGWLHKKFPQAEILVTJDKFAYAGGFVRTESPQGFSFIJL 
G PKGFLTRGDGEYTLKL I H ELGLQNS L I FSDRAAKNRFVYYRGKAH KIS TWTLLRKGLL P 
SL I KDF RAPCYTQDS SVQDFLKRHS SQNFTSY I LDPL ITAI RAGHSS I LSTHMAF PELAK 
RFASSGSIXRSYLKNRSPKKSKTDRYI^LSPSMGTLITTIQEKLPATWKFSTSVTHIDC 
SPKEACVTTPSETFFADMVIYTGPLQQLPVIiP^GIE^ 

FSLPKGYGMLFAI)ELPLIX3IVWNSQIFPQATPGKTVT*SLLIEGKWRESEAHAFAIAALSE 
YLN I NQKPDAF ALF S S Q DGMPQ HAVG F L E RKER I L PHL PGNLK IVGQNI AG PGLNRC IAS 
AYHAICDLHTEETLAQPQSSL 

CPn_0889 1016841 1015462 

hemN-Coproporphyrinogen III Oxidase 
FllMi?NVNFKFLEGLHQPAPRYTSYPTALEWEPSDAAPA^ 

CQ^CLYCGCSVVLNRJ^DIVEAYINTLIQEMKLVVETIGFRPQVSRIHFGGGTPSRLSR 

ECS^IXFDHIHKLFDLSHAEEIAIEVDPRSIJ^OdEKADFFQNVGFNRV 

QEATORRQSHEESLKAYEKFKELAFQS INIDL IYGLPKQTKESFSKTIQDILAMYPDRLA 

LFSFASVPWIKPHQKAMKASDMPSMEEKFArYSQSRHLLTKAGYQAIGMDHFSLPHDPLT 

LA^KNKTL I RNFQGYSLPPEEDLLGLGMTSTS FI RG I YLQNAKTLEEYHNTVLRGTFATV 

K^ILTEDDRIRKWAIHKLMCTFTINKEEFFNLFGYEFDTYFIESRDRLISMETTGLIHN 

SPg|LKVTPIX3ELFVRVIATAFDHYFLNKVSKKECFSASI 

CPn_0890 1017829 1016819 

hemE-Uroporphyrinogen Decarboxylase 

STliiS^SMSAFFDLIJCSQTASHPPIWIXRQVGRYMPPYQELKGSQSLKTFFHOTEAIVE 
ATrLijL^PSLLHVDAAILFADILSILJX3FAVTYDF 

LESflERTLKQKLPVPLIVFAAS PFTLACYLIDGGASKDFSKTMSFLYVYPEKFDQLI STI I 
EGTA I YLKTQMDAGAAAVQLFES S SLRLP SAL FTRYVTEPNRRL I AKLKEQAI PVSLFCR 
CF%ENFYTLQATQADTLHPDYHVDLHRIOKNI*MLSLQG^DPAIFLLPQEK^ 
VEjjUBeTYPNF IFNSGHG ILPETPLENVQLWS YVQRQL 

CPn_0891 1021079 1017819 

mfd-Transcription-Repair Coupling 

NF|tSMDFNPVNLDF S I SKEFKEETLPLLLEN I H PGATAFLAAKMFHDCRASVI M ITTPAR 
LDPLFENLRT FLDQAPVEF PSS E IDLS PKLVN I DAVGKRDHLLYS LNQHRAP I FCVTTLK 
ALIiEKT RS PQATSQQH LDLAVGDVLD P EATT ELCKS LGYSQVMLTS EKG EFSCRGG I VD I 
FPLSSPEPFRIEFWGEKIISrRSYNPSDQLSTGKVSKISISPAYTEEASGGNYSHSLLDY 
FS^LYLFDNLEILEDDFADISGTLSSLPDRFFSIGTLYDRISTSNQVYFSETPFPNVK 
NLKE^IRVI I EAFHRNMEASRQAI PILYPEQ I IQNDENPLLAF LQHLQEYMPPHGKPLKLA 
IY^KTKSLKEARALAETVARGDVEIYEKTGNLTSSFALVNEAFAAISLSEFASTKVLRR 
QKQRTHFSVTTEEVFVP I PGETWHIHNG IGKFLG I EKKPNHLNI ETDYLVLEYADKARL 
YVPSNQAYL I SR YVGTSDKAADLHHLNSSKWKRSRDLTEKSL IVYAEKLLQLEAQRSTTP 
AFVYPPHGESVI KFAETFPYEETPDQLKT I DQ I YNDMMSPKLMDRL ICGDAGFGKTEVIM 
RAAVKAVCDGHRQVI VMVPTT ILATQHYETFKERMAGLP I EI AVLSRFSOAKVQKLICEQ 
VASGQIDI I IGTHKLINKSLEFKNPGLLI IDEECRFGVKVKDNLKERYPMIDCLTVSATP 
IPRTLHMSLSGARDLSVIAMPPLDRLPVSTFVMEHNTETLTAALRHELLRGGQAYVIHNR 
T ES IYT LAET I RNL I PEAR IGVAHGQMGAEDLSNI FTKFKNQKTDI LVATAL I ENG IDI P 
NANTILIDHADKFGMADLYQMKGRVGRWNKKAYCYFLVPHLDRLSGPAAKRLAALNKQEY 
GGGMKIALHDLEIRGAGNILGTDQSGHIGTIGFNLYCKLLKKAVSALKKHTSPLLFNDDV 
KIEFPYNSRIPDTYIETGSMRIEFYQKIGNAESSEELTAIQEEMRDRFGPLPOEICWLFA 
LAEI RLFALQHG I SS I KCTANALYVQKCLSKS EOTKKTLPYALS PT PELLVKEVIES I ER 
GFLINAS 

CPn_0892 1023673 102104? 

alaS-Alanyl tRNA Synthetase 

EFFFMLSNTIRSNFLKFYANRHHTILPSSPVFPHNDPSILFTNAGMNQFKDIFLNKEKVS 
Y S RATT SQKC I RAGGK HNDLDNVG HTS RH LT F F EMLGN F S FG DYFKAEA I AFAWEVS LS V 
FMFNPEGIYATVHEKDDEAFALWEAYLPTDRIFRLTDKDNFWSMANTGPCGYCSELLFDR 
O P S FGN AS 3 P LD DT DG ER F LEYWN LVFM E FNRT S EG 3 L LAL PN K HVDTG AG LERLVS L I A 
GTHTVFEADVLRELIAKTEQLSGKVYHPDDSGAAFRVIADHVRSLSFAIADGLLrcOTER 
GYVLRK ILRRSVTfYGRRLGFRNPFLAEIVPSLAD.^MGEAYPELKNSLSQIQKVLTLEEES 
KFKTLDRtXJNLLQOVLKriSS^SSCr^GEDAFKLKDTYGMPIDEI^LLAKDYDYSVDMDTF 
H K L EQ E AK ER S R K NW(J SOOT S E S I /N ELH LT E F I GYDH LSr DT F I EA 1 1 C Y DH I VS S L 
Q EKO EG A [ VL K V P F Y A EKG( jQVG LDG E I FC f> FGT F [ VTI ITT : ' P K AG L I VH HG R I SQGS L 
' t 'V EAAVTAOVNR Y R R K R t ANNHTACHLLH KALE ITLGDH I ROAi 'S YVDDT K I P LDFTH PQ 
A [ PEDLL^ ■ I ETLVNF.:3 t RENEFVDI P EALYS rVMN.'J.I C [ KOFttlDKYSDWRVVSAOHS 
HKLOGGTliALiATf ;u IGFFR [TKP.HA7AMG IRR I CAVTGr.KAKATV!10Q"EVLE£IATLLQ 
VPkL/j [ Vl'RI.TATI .DF.PKQQDKRLI JCLEN' .L 10 TKLOKL [ UNCI iOR^G LTL L7HHLAEHE 
NMW/>jYAOv 'I. HUH I PFKL CSLWTTEKNGKY IV L:~RVSDnr. [T(.\ IV! iAQDLLKAVLTPCG 
( ;RWr:GKUJ:iAOG::APALPATt:VLNETLWOV^ 1 f 'T0L 1 

f kt P. ■[": ,uu:k(it o L,u,f 
EFLAFCLG I "YSCCFY I EG LOG LI -M I NKF.LD IG [ LGK IAGAIKQ IS I ES IQKASSGHPGL 
PLCCAELAAYLYGTArLRQNPPDPHW[NRDRP/LJACHGSALLY3CLHLAGFDVSLEDLQE 
FRQLHSRTPOHPEYGETVGVEttTTCPLJs^^ 

YCLAGDGCFMECVSHEVCSFAGSLNLNNLWIYDYNNV'yLDGYI^EIS^ 
WDVYEIDGYDFTHIHETFS3 IKRCQERF\'LVI AHT I IGHGSPKEGTNKAHGS PLGVEGTH 
ET KQFWH LP EEKFFVP PAVKNF F AH K 10 EDRKAQ EQWL DEVR VWS KQ F P ELH EEFVALTS 
HKLPKNLESLVQSVEMPDS IAGRAASNK L IQVLVQH IPYLIGGSADL3SSDGTWIANEKV 
T HTYDFSGRN I KYGVR EFGMAT T MNG LAYSOVFR PFGGT FLVFSDYMRNA I RLAALSKLP 

'.TTIiL ' « " , * t *. *". f M " •/'/rsPW. • - \ 

' 'J'ROAf "*"'•■* 'M"! ' *-"'.. -I - ' - r % YTE.E'\r ^ ~ " \L VVKELEH 

LDK0VRW3FPcWELFE/\wLVLYKwJ r* GGLLG I RViS I iiAGS ALGWY K Y IG^EGLAI AMD 
RFGYSGASDDVSEECCFTTEO ILOR I L3Q 

CPn_0894 1026823 1025888 

amn-AMP Nucleosidase 

PRNDKNAKNLRRKHYKGERVSKHTSESRIAQDMLERYSGSSVKOFCPYLLLTNFSYYIOT 

FAK^GVPVFEGSMFSAAHAPHLKTSILOFKLGSPGAAI,TIDLCSFLPDIJ<AAIJ^G 

GLRSHYQVGDYFVPVAS IRGHX3TSDAYFPPEVPALANFVVQKATTEVLEDKKANYH IG IT 

HTTN IRFWEFT^KKFRKKLYETKAQS AEMECATLFAAGYRRNLP I GALLL I SDLPLRKEG I 

KTKSSGNFIFNTYTEI)HILTC^EVIEKLEKVMIJCRAASD 

MASGSETSDSDY 

CPn_0895 1026973 1027557 

efp-Elongation Factor P 

E I DCFMVRVSTS EFRVG LR I E I DGQ PYL I LQNDFVK PGKGQAFNR I KVKNFLTGRV I ERT 
YKSGESVETADIVERSMRLLYTDQEGATFMDDETFEQEVVFWEKLENIRQVn-LEDT IYTL 
VLYNGDWAVEPPI FMELS I AETAPGVRG DTASGR VTJCPAVTNTGAK I MVP I FIDEGELV 
KVDTRTGSYESRVSK 

CPn_0896 1027574 1027822 

CT753 hypothetical protein 

EKYFFFTVRNMEAKKI KELSKEAQLLKKLREKSRVLDEKNKRKAWVAKLVAMPES IRE I E 
KEERVETPQLFQAI AEKI LEEGV 

CPn_0897 1028794 1027853 

(phosphohydrolase) 
NFSLDSNTVDQKNKSNPRPMQEKPRHVHR 1 1 H I SDVHFHVL PVNPVHC FNKRLKGLLRKV 
FGLVHFQATTIGQRFPKVVRSLGADSVCITGDFSLTAMDGEFULAKHFV 
LPGNHDVYTLKSLAQQTFYTH FPNDQLQQNKVS FHK I TDHWWL I LLDCSCLNGWFSANGV 
VHLAQISAI ETFLLSLSPEE2^VI I ANHYPLLSSQNPSHDLIN^^^HL^^A^KKYP^CVRLYL 
HGHEHQAAVYNCADTS PSY I LNSGS I SLPTNSRFHV I DL Y PEKYQVHTM I LKNLLDFDAP 
LEI ANEATWDCQKL 

CPn_0898 1030511 1028904 

Mitochondrial HSP60 chaperonin Homolog 
TKiCRLGSVKILRIXGVCMSEQEKLSbTrT^ADKKL^ 

KERGFYAI SQTELSNSYE>n J GrVTJFAKAMVNK I HKEH S DGATTGL I LLHA I LQ ESYAALEK 
G I STHKL I AS LKLQG EKLQ EALQCQ SWP I KDAUCvTlN IIFSSLHMPTIADH FYNAF SWG 
PEGLIS ITKERZNDKTSMDVFQGFK I PAGYASTYFVS DTAS RLTR I AHPL I L ITDRKISM 
IHSLLPLLQEISEQNQHLI IFCEDIDPCVLATLVVNKLOX3LLQVTVVT I PQLSTTNQELA 
ED I AIJ : TGTH IC PCQEAS HVLAPEMVTLGSCLS I E I SESQTTLIGGLHI PEVLTLKTRQL 
AEEIRTTSCLF/TKKRLIKSTN^LQSSVAILPTDEDNEP 

VAIJ^ASLTLGTPKDDADENSIAISLLOKACCAPLKLIJ\m\DLDGDAVIAKLSSLGT^ 
LGISVFSREI EDLIAGG ILDSLATTST ILAQALDTAI LVLSSKI LI LENQYEI STL 

CPn_0899 1030848 1032215 

murF-Muramoyl-DAP Ligase 

NHRCCRQNYMRAMLLEDWVSLMLSDVSCPKCDKKITGFAI DSQQVQ PGDLFFALPGNATD 
G H Q FLKHAAT AGAVAA WS HDYQGDSFGLELI R VDDT KS ALQ EAG S NQC NL FQGTVJG I T 
GSVGKTTTKEFSKTILSSIYKTHASPKSYNSQLTVPLSLLMAEGDEDVMILEMGVSEPGN 
MQDLLRIVQPEIAVITHIKDQHAMHFPQGIQEILKEKSYILOKSKLQLLPKDSPYYLDLR 
SCSPTAEXFSFSFNDPIJ^FCYKAISGDSWIC/rPEENYCLPIAFSYKPAYTNLLIAVAL 
SWILEVPEEGVIRSLPEIJCLPPMRFEHSMRNGMQVINDAYNACPEAMIAALDALPLPSDG 
GK I ILI LGHMAELGRYSEEGHALVAEKAASRGDMIFFIGEKWI PVQSVLKSYSCEVSFFS 
S AQDVKD I L KQVARYGDVI LLKG SRALALES LLAC F 

CPn_0900 1032208 1033281 

mraY-Muramoyl-Pentapeptide Transferase 

LVFNFLGASM I PLI PMFLKQSLFFSIJVLTGMTTLVT.TVALGVPVMKWLKRKNYRDYIHKE 
YC EKL EMLH KDKAEVPTGGGVLLF I S L IASLLVWL PWGKFSTWF F 1 1 LLTCYAGLGWYDD 
R I K IKRKOGHGLKAKHKFMVQ I AI AAFTLIALPYIYGSTEPLWTLK I PFMEGMLSLPFWL 
G KVFCLGLALVA I IGTSNAVNLTDGLDGLAAGTMSFAALGFI FVALRSST I P IAQDVAYV 
LAAL VG AC IGF LWYNG FPAQLFMGDTGSLLLGGLLGSCAVMLRAEC ILWIGGVFVAEAG 
SV I LQVLSCRLRKKRLFLCS PLHHHYEYOGLPETK IVMRFWI FS FVCAGLGI AAVLWR 

CPn_0901 1033239 1034537 

murD-Muramoylalanine-Glutamate Ligase 

FCMRRSRYSGCLMEIDMCQRILILGTGITGKSVARFLYQQGHYLIGADNSLESLISVDHL 
HDRLLMGASEFPEN IDLVI RSPG IKPYH PWVEQAVS LK I PWTD IQVALKTPEFQRYPSF 
G ITGSNGKTTTTLFLTHLLNTLG I PAIAMGN IGLP I LDHMGQPGVRWE ISSFQLATQEE 
H I P ALSGSV FLN FS RNH LD YHRNLDAY F DAKLR I Q KC LRQ DKTFWVAJEEC S LGNSYQ I YS 
EE I EE! LDKCDALKP lYLHDRD^CAAYALANEVGWVSPEGFLKA rRTFEKPAHRLEYLG 
KKDGVTtYINDSKATTVTAVEKALMAVGKDVTVILGGKDKGGDFPALASVLSQTTKHVIAM 
G EC RQT I ADALS EK I P LT LS KDLQEAVS I AQT I AQ EG DTV LL 3 PGC AS FDQFQS FKERGA 
YFKLL I REMO AVR 

CPn_0902 1034507 1035241 

nlpD-Muramidase (invasin repeat family) 

AVDQPNAGSEVNMNRRDMV rTAV^VNA 1 LLVALFVT.^KR IGVKDYDEGFRNFASSKVTQA 
W::LEKVIEKPWAnvpr:RPIAKETLAAOFrESKPVIVTTPPVPWL';ET['EVPTVAVPPQ 
PVRETVKEEOAPYATVWKKODFLERIAPANfrrTVAKLMQlNDL'rTTOLKTGQVIKVPTS 
ODV'j'NnKTr^TOTANPFNYY [VOEGD^PWTI ALRNH IPLDDLLKMNDLLEYKARRLKPGD 



f I'ujyn) i i o \i,/4<j i oi, „i l / 

ftsW-Cell Division Protein FtsW 

';ko';[/:mkwfvi :;cur, i r: : t / ; l i Mvrr/rr;s ai ;vldr.glec:jThkal t povtylilglc ;v 

A:;LLYMMEMROFLKr.:['VI,L It lAAIALf VI' [ PflLC) ICPNf )Ai<\<ViU IFG^r.T IQV WVK 
Yl.Vf * I VALYri/IVi'"'! ,Y<jKOl.KMf- 1 .KLTA I LT [ P [ LL [A IEPL^ i.'AAV I ;JA:3[ ■ U'VF [M 
T:;VPIJ<YWLl,!'I 1 LrvraAr r rA[V-YPMI'7VMYRLWYLHPELDtKGIiOII0PY0AKrAA{^J(-; 
VIA / JKGI ilA.^I .yKLTYIiPKAOMfj/ FAA I /AKI-TGFLGMLVL I LLYMOFr^X )YA tA I KAS 
.'JLEGAALAMVITLI I3^TOAFMNL^JVVoCLLP^KC"/NLPFFG00C3GLIA^MCCVTLLLKV 
YDEEX3K3HU;CRRFRRPHCP3SLGKG5FF3 

CPn_0904 1036320 1037396 

murG-Peptidoglycan Transferase 

RYMMKKIRKVALAVGGSGGHIVPALSVKEAFSRBGIDVLLLGKGLKNHPSLQQGISYREI 
PGGLPTVLNPIKrMSRTLSLCSGYLKARKELKIFDPDLVrGFGSYHSLPVLLAGLSHKIP 
LFLMEONLVr^KVrJOr-FnRYARGrGVNFnPVTKHFPCP^E^/FLPKRSFSLGSPMMKRCT 

fiirrr'fh.'VVf :<.:;V i/^imh*' v valvkl /:;••' - ■ v . T 'ii:v;rK Yu ;™r\<.\\\' ni^r;, 
L' ^vkpf^eclluvllaallv l . ;iO f JA : lll . 1 r ' , : l : pvpg/ - .1 c. r.vMArr* 

DVLEGGTMI LEK ELT EKLLVEKVT F ALDSHNR EKQRNS LAAYSQQRSTKT FH AF I C ECL 
CPn_0905 1037400 1039835 

murC&ddlA-Muramate-Ala Ligase & D-Ala-D-Ala Ligase 
VHYMKGTPQYHFIG IGGIGMSALAH I LLDRGYEVSGSDLYESYTI ESLKAKGARCFSGHD 
SSWPHDA^A^/YSSSIAPD^A^E^YLTAIQRSSRLIiHRA£LLSQLMEGYESILVSGSHGKTG 
TSSL IRAI FQEAQKDPSYAIGGLAANCLNGYSGSSKIFVAEADESDGSLKHYTPRAWIT 
N I DN EH LNNYAGNLDNLVQ VI Q DF S R KVT DLNKVFYNGDC P I LKGNVQG ISYGYSPECQL 
H IVSYNQKAWQSHFSFTFLGQEYQD I ELNLPGQHNAANAAAACGVALTFGIDINI IRKAL 
KKFSGVHRRLERKNISESFLFLEDYAHHPVEVAHTLRSVRDAVGLRRVIAIFQPHRFSRL 
EECLQTFPKAFQEADEVILTDVYSAGESPRES I ILSDLAEQI RKSSYVHCCYVPHGDIVD 
YLRNY I R I HDVC VSLG AGN I YT IGEALKD FNPKKLS IGLVCGGKSC EHD I SLLSAQHVSK 
YISPEFYDVSYFIINRGGLWRTGKDFPHLIEETCGDSPLSSEIASALAKVDCLFPVLHGP 
FGEDGT IQGFFEI LGKPYAGPSLSLAATAMDKLLTKR'I ASAVGVPWPYQPLNLC FWKRN 
PELCIQNLIETFSFPMIVKTAHLCSSIGIFLVRDKEELQEKISEAFLYDTDVFVEESRLG 
SREIEVSCIGHS S SWYCMAG PNERCGASGF I DYQEKYGFDG I DC AK I S FDLQ LSQES LDC 
VREIAERVYRAMCGKGSARIDFFI^EEGNYWLSE3/NP 
I VDHF I I DALHKFDKQQT I EQAFTKSQ DLVKR 

CPn_0906 1040514 1039915 

CT763 hypothetical protein 

KWGSEVLELVNDSQLSREASAFRLDIDFFILNIYPFFRNFKNIELC FFLS I SQFNLDFME 
EFVAY I VKNLVTNPEAVEI RS I EDEDNES I KLE I RVAAED IGKI IGRRGNT I HALRT ILR 
RVCSRLKKKVQIDLVQPENGTDVIADQ0YICDNDSSNSTEI3TFGESDTCCSGHCHYDEDL 
NQEEQEEGNMHHSCECSNHH 

CPn_0907 1040816 1040445 

cutA Periplasmic Divalent Cation Tolerance Protein CutA (C- 
Type Cytochrome Biogenesis Protein) 

FAFSKFLIIKSSMTAVLILTSFPSEESARSLARHLITERLASCVHVFPKGTSTYLWEGKL 
CESEEHHIQIKS IDIRFSEICLAIQEFSGYEVPEVLLFPIENGDPRYLNWLT ILSYPEKP 

PLjSDj: 

CPn_0908 1041607 1040780 

CTTg^ hypothetical protein 

ILA^LFMIIIKNNELMIRRFFKTLFPPGPQYSIXr^ASILrV^ 

LS &3FNP SP I RNL FLVS STLSKVPPTAI AEHLRLSADAPTYLH EFS I KEAES SLHALG I F S 
SLV££KSPDNKGITIrYTLC/TPIAYVGlSlRSOT 
. DLKt^QKLPKEKMLFTK I LLKELAMES PKI IDLSLSDAYPGEI IVTLSSGSLLRLPIKTLD 
RA&j^YKHMKKSPV I esekqyvydlrf PNFLLLKAL 

CPn_0909 1041592 1041966 

rs&y* Sigma Factor Regulator 

1 1 SET FTRFLLERLLMNLS AKEYGD 1 1 VI YLQG SLDAVSVPS VQEYLEQ F I qkkhlki AL 
NFf DVSYIS SAG IRLLLSNFKLVQSLGGKMCLCCVKESVTEVMRIAGLDQLr LLCQSEQE 
CLSKL 

CPn_0910 1041970 1043004 

miaA-tRNA Pyrophosphate Transferase 

FLYMLPFEFEFNTTSS PECDVCLDPQKLFVKLFKRT IVLLSGPTGSGKTDVSLALAPMID 
GE I VSVDSMQVYCGMDIGTAKVSLKARQE I PHHL ID I RHVQEPFWVDFYYEAIQACQN I 
LS RNKVPI LVGGSG FYFHAFLSGP PKG PAADPQ I REQLEAI AEEHGVS ALYEDLLLKDPE 
YAtSflTKNDKNKI IRGLEI IQLTGKKVSDHEWDIVPKASREYCCRAWFLSPETEFLKNNI 
QMRCEAMLQEGLLEEVRGLLNQG I RENPSAFKAIGYREWIEFLDNGEKLEEYEETKRKFV 
SNSWHYTKKQKTWFKRYS IFRELPTLGLSSDAI AQK I AKDYLLYS 

CPn_0911 1044079 1042985 

Fe-S cluster oxidoreductase 

SLL'MlFNVNYFMNLCKRI SFEEGLELFVSS P I ERLQERADA IRKERYPSNEVTYVLDAN 
PNYTNICKI CCTFCAFYRKPKSPDAYLLS FDEVRSLLORYVSSGVKTVLLQGGVHPGLG I 
DYLEELVRITVQEFPS IHPHFFSAVEI EHACRVSG 131 EQGLQRLWDAGQRT I PGGGAEI 
LSERVRKI I SPKKMQPC^WINLHKLAHLMGFRTTATMMFGHVENPEDILIHLQTLRDAQD 
SC PG F YS F I PWS YK PGNTALRRNVPQQ AS I ET YYR I LALGR I FLDNFDH VAASWFG EGKS 
LGAKALHYG ADDFGGVILDESVHKATGWS IQSSEEEICNI IRSEGF I PVERNTFYQH ISC 
TVSSL 

CPn_0912 1044120 1045760 

CT768 hypothetical protein 

WIMDNSDNSFHTLETEQGSFLNDELAVEEVASTESTEISDATLCFAEKKVAFILNKMRE 
ALTGSSQGSDLRLFWDLRKOCLPLFNEIEDTAKRADHWRCYIELTKEGRHLKGLQDEEGS 
FWGQIDLAITCLEKDILKFQEGTEDKIFKDREDNFLESQALDKHQAFYKQHHTSLLWLS 
^FSSKIIDLRKELINVGMRMRLKSKFFQRLSNLGNQVFPKRKELIEKVSQTFAEDVDAFV 
AKYFIGSDKETLKKTVFFLRKEIKNLQHAAKRLF\ r S3HVFAETRLKLSKCWDQLKGMEKE 
I RQEQGRLRWS AENSKEVRQMLAEVSSLLI EGNDL3KVRKDLEG ISKKI RALDLTHDDV 
I 5 LK K EMQQ LF DQLRE KQD AA EH S YQ EQLAKDKQVK V EAARS LA ER ITT FS KTC S EGN I T 
SET3REEWQTLKELLGKMSFLPPPEKISLDNQLNLALQTIVNFFEEQLLSSPDSREKLVNM 
RCVLKQRRERRQELKDKLEQDKKLLCSSGLDFDRAMQY3ALVEEDKRALEELDASILELK 
QQZQQLL 

CPn_091J Ll)4570'i L04VM5 

No robust homolog present in GenBank/EMBL as of 11/7/98 

H h K.-KYFR CEATDJAIAMRRNCIYAFDLDCTLLKC;N';';W f >FYCYGLL^\;LF3YKTLPP(' I 

YRFFRFKFFFOIFIIPSI [R 

< TliJ)'»H 1{MV>(]'> iOAr, i'i ' 

No robust homolog present in GenBank/EMBL as of 11/7/98 
VPFVlDLPill-'Y'/r, rVTRI il jSSVPCDDLYEVALNrV^TLT'i 'DFYAPVLEKLEEArADTTCO 

v t [,i-y;.';:;it)f rvm 1 t aoqi,< 1 1 S3WY Ar.cYPE/j^ -xfot l ykkplt^ ;dkka<j ll^y ikk [no 

AR:;IITF:M^H ELrJhrKLMt^JEEKTVVRrOLlRLKKM.XKKYYWNrV 
-■[•ii_0'M l "> UMt.40 L UMMU ' 
ybt*B-LOMP sup^rt.imi L/ ot^holoq 

FH LKK33TL3W I [ EY PKAG FM D'3 FO FDL L KVAAKA I DDKKGNNLWLDVR? 1 3EFTDYP/ 
FVEGSVNVHVKALANT I VEELKKQKV3 FLf IVEC ITDCNWWI DYGF r WHVFVSE I RGKY 
RLEELWKDGF t VT3KLLA3 

CPn_0916 1046813 1048084 

fabF-Acyl Carrier Protein Synthase 
LLNCVRVYMSKKRVVVTGFnVVSCLGNEVDTFT 

• : ' : v'mi •'. ' ' '>:"!*<: T -"t.**\p\ > Mf ! " ": • r 'AT"rr ; '" f 

DAAYQHLVSGRADMI ICGGTEAAVNR I JLEGFIANRALSERNDAPDOASRPWDRDRDGFV 
[jGEGAGILVLETLESALRRDAPIFAEMLGSYVTCDAFHITAPRDDGEX3ITACVIX3AL^ 
G I PKERVNYVNAHCTSTPLGDLSEVIJ\VKKAFGSHVTRNLR>^S^ 

WAIQAILTGKLH PT INLDNP I A£ r EDFDWANKAQDWDIDVAMSNSFGFGGHNST ILFS 
RYVP 

CPn_0917 1048054 1048539 

hydrolase/phosphatase homolog 

FNDI ILEVCTLVMMKTKYEYSFGVI P IKFFGTPDKNTLKACFICHTRGKHWGFPKGHSED 
KEG PQ EAAERELVEETGLSWN F F PKVL I EOY S F*NNEEO% r FVRK EVTY F LAEVRG D I HAD 
PMEICDSCWI^LQEGLRIJLSFPEIJ^LTVEADKFINNYLFSS 

CPn_0918 1049232 1048579 

ppa-Inorganic Pyrophosphatase 

E^I^SKKPLWAKPWSFTLTQDm , ESIXCYIEITPYDSVKFELJ}KATGLLKVDRPOKFS 
NFCPCLYGLL PQTYCGT ASGNY SG EQT RR EG I CODKD PLDVC VLT EKNIHHGNI LLQAR P 
IGGLRI IDSGEADDKI IAVLEDDLVFAEIEDISDCPGTVLDMIQHYFLTYKATPNKLIKG 
S PAKIEIVG I YGK KEAQ KV I QLAH EDY LS Y IGDTAEVN 

CPn_0919 1049375 1050430 

ldh-Leucine Dehydrogenase 

n^SLNFKEIKIDDYESVIEVTCSKVT^U^IIAIHCTAVGPALGGv^ 

DALRIARGMTYKAI ISNTGTGGGKSVI ILPQDAPSLTEDMLRAFGQAVNALEGTY I CAED 

LGVS INDI S I VAEETPYVCG I ADVSGDPS I YTAHGG FLC I KETAKYLWGSSS LRGKK I A I 

QGIGSVGRRLLOSIJFEGAELWAIJVLERAVQDAARLYGATIVPTEEIH 

RGNV I RKDNLADLNCKA IVGV ANNQL EDS S AG MMLH ERG I LYG P DY LVNAGGLLNVAAA I 

EGRW APKEVUJCVEELP I VLSKL YNOSKTTG KDLVALSDS FVEDKLLAYT S 

CPn_0920 1051423 1050431 

cysQ-Sulfite Synthesis/biphosphate phosphatase 
ILEENSMHSELP^QNIVESVVTErTTQLLNYRSEHRLVPFWEKSD^ 
LKQQLAKAFPNI PF IGEETLY PDQDNEKI PE ILKFTRLLT S S VSRDDL I STLVPPPS PTS 
LFWLVDP I DGTAGF I RHRAFAVAI SL I YEYRP I LSVMAC PAYNQTFKLYSAAKGHG LS I V 
HSQNUDRRFWADRKQTKQFCEASIuAAI^C^HHATRKLSLGLPNTPSPRRV^ 
AEGAVDFF I RYP F I DS P ARAWDHVPG AFLVEEAGG RVTDALG AP LEYRK ES LVLNNHAV I 
LASGDQETHETTLAALQNQLNVVPTDKLIAL 

CPn_0921 1051526 1052293 

snGlycerol-3-P Acyltransferase

G ELML I KLWRATYEGMYTF LVGALLKLRY RMQ VEGWDT LN I N P KQGCL F LANHVAEVDP I 
ILEYLFWSRFHVRPMAVEYLFHSRWQWFLNSVRS I P I PQLVPGKESKRSLERMNVCYEE 
ASRALNRGESLLLY PSGRLSRTG KEE rWQYS AYVLLHRVMECNWLVRVSGLWGS AFSR 
YKQNST PKLG P AFKEAF RALLRRG I FFMPKRFVKITLCQVDHLFLKQFPTKQDLNTFLAS 
WFNQGDDNLP I EVPYA 

CPn_0922 1052266 1053927 

aas-Acylglycerophosphoethanolamine Acyltransferase

QFAHRS SLR ITRKLRRMHDQRNRGHNNHNLRLRPGSTLLEAFLILCSEHEEG I ACFDEHL 

GSLSYRELRNAI IAVAIKVSKFS£I>RVGVMMPASIGAFIAYFGILLAGKTPVMyttWSQGL 

RELRACTKTVEVRRVLTSQOFIKHLTEVC^FVEYPFDLjMYMEDVRKRLSWWEKCRIGLYS 

KCSVPWLLRIFGVSGVESDDTAVILFTSGTEKLPKAVPLTHKNLMENQEACLKFFDPNTQ 

DVMIAFLPPFHAYGFNSCGLFPIJJMGVHVVFASNPLNPKKLVEF IDDKKVTFFGST PVFF 

DY ILKTAKKQNSCLESLRLWIGGDALECDTLYEETKKLQPQI ALYCCYGATECSPVI S IT 

TKESPRKSECVGMPIEGMDVLIISKETHIPVSSGEQGLIWRGNSVFSGYLGNHEHQSFV 

SLGGDQWYLTGDLGHIGPSGDLFLEGRLSRFVKIGGEMVSLEALES I LH EH FTENQNEDA 

GSLWCGIPGDKVRLCLFTTLATTIHEVNDILKSAETSS IVKISYVHQVESI PILGIGKP 

DYVSLNALAVSLFG 

CPn_0923 1053966 1055093 

bioF_1-Oxononanoate Synthase_1

VC KES F LTTSDV I DFVTNDFLG FARS PT I YCEVS KRFQ I HCC?QF PH EKLG I RGS RLMVG P 
SSV I DD LES K I ASYHG APNAF I VNSGYMANLG LC HHVSR STDVLLWDEEVHM SWH SLS A 
ISGQHHTFHHNNLEHLESLLQCYRISSKGRIFIFVSSVYSFRGTLAPLEQIIALSKKYHA 
HL IVDEAHAMG I FGDDGKGLCHALGYENFYAVLVTYGKALGTMGASLLTSSEVKYDLMQN 
SPPLRYSTSLSPHTLISIGTAYDFLASEGEIARKQVFKLKEHFHECFDSHAPGCVQPIFL 
PHTCLEEAI SVLETTG rHVGWAFAKHPFLRVNLHAYNTVDEVNLLAQVMKPYLEKSSHR 
VH I NHEFHLWRELCCH 

CPn_0924 1057301 1055028 

priA-Primosomal Protein N'

KRFTAKTKSMGYIESSTFRLYAEVIVGSNINKVLDYGVPENLEHITKGTAVTISLRGGKK 
VGV IYQ I KTTTQCKK r LP I LGLSDSE I VLPQDLLDLLFWI SQYYFAPLGKTLKLFLPAI S 
3NV IQPKQHYRWLKC3KAKTKEILAKLEVLHPSQGAVLK I LLQHASPPGLSSLMETAKV 
3QSPIH3LEKLGILDIVDAAQLELQEDLLTFFPPAPKDLHPEQQSAIDKIFSSLKTSQFH 
THLLFG ITGSGKTEI Y LRATSEALKQGK3T I LLVPEIALTVQTVSLFKARFGKDVGVLHH 
K LS DS DK3 RTWRQAS EGSLRILIGPR SAL FC PM KNLG LI I VD E EH D PA Y KQT ES PPCY HA 
PDVAVMPGKLAHAT\VLC.. C JATP3LESYTNALSGKYVLSRLS3RAAAAHPAKISLINMNLE 
P EKSKTK I LF^QPVLKK [ AERLEVGEQ^/L t FFNRRGYHTMV3CTVCKHTLKC PHCDMVLT 
FHKYAf r/LLCi-iLCN^JPKDLPQ^CPKCLGTMTLQYRGSGTEKIEKILQOIFPQIRTIRID 
"DTTK TKGS H ET LLRO F 'V IK ADVL [ CTQM T AKGMNFSAVTLAV I LrJGDSGLY r PDFRAS 
EOVFOLrTOVAGRSOR^NLlXirrLIO^FLPDHPTIHSAMPODYSAFY^QELTGRELCEYP 
PFrRL[PC:FMCK('rK^PWEFAIIPVHNILKE0LE3TNPLMP'/TPCGHFKrKDTFRYQFLI 
K3AYV r PVNKKLHHALMLAK L' ; I >KVK P'M r IIVDE'MTTFF 

r \'\\_l)')2S 1(5'- f j 1 1 > 1 'j'j ! 

'•TVV'i hiyporhoru.il pt .t^ in 

KHWL.FMEr^ONFHDTt.COLI.DP /.'JPFJ. r^rLAiU.LNVTLPflTA [:-A':v;:U PEKAVEVPN 
AEF^PITPPPF'TNLJOflKTKI ' ;r-WKi'V Pr.HPDL^ONAILKEKYPALK^SLPAPK I PCS I 
FVYKnnNEEVLFFNRLAKr 1 .'IVv-I-l 'PTK1 /I'I, [IIAKTNT F'^^PNFFLALAPL,^/ I RYK IP 
TTDYI (O^LTQNGC I F[ ,PLY: :t .E7FK [/XJLKRMLWA I LhJPLPFAYTPK^:' 
Thioredoxin Disulfide Isomerase

CH HTQQTYl/TR F F K D3 DMK FWLQGCA FVOC LLLT LPCC AARR RASGENLQQTR P I AAANL 
QWESYAEALEHSKQDHKP ICLFFTGSDWCMWC I KMQDQ I LQSSEFKHFAGVHLHMVEVDF 
PQKNHQPEEQRQKNQELKAQYKVTCFPELVF I DAEGKQLARMGFEPGGGAAYVSKVKSAL 
KLR 

'^rn_00^7 £\Xf nS191H 10SR^70 

1 lii/r. J ; k; i ' ' t • ■ ir. , i* it! . i < 

:.:.:m;:.:i': , !'':;! ; , ^"'tM::'/' : /,u-rr ; :t- v • : \ ~ : :r.Tr mitt:. ;rrv;r;:v^ 
FI ISI ILFLPLALLWVLKKTCQFFILPSSI ISQSMSKTAVAIRRMTFLSHIKQLLSLKEI 
SAADRWIQYDDLWDSLAIKIPHALPHRWILYSQGNSGLMENLFDRGDSSLHQLAKATG 
SNLLVFNYPG IMSSKGEAKRENLVKSYQACVRYLRDEETGPKANQI IAFGYSLGTSVQAA 
ALDREVTDGSDGTSWIWKDRGPRSLADVANQ ICKPIASAI IKLVGWNI DSVKPSERLRC 
PEIFIYNSNHDQELISDGLFERENCVATPFLELPEVKTSGTKIPIPERDLIJ^LNPLSPNV 
VDRLAAVISNYLDSENRKSQQPD 

CPn_0928 1061035 1059884 

*CHLPS 43 kDa protein homolog_3 

RRKDFAFTLLNLSNRSDILSGI FSNPHPVSYFSSTHAKQLSDFSKKHPI LTKIVTI IVKI 
FKLLIGLIIPPl^IYWLCQLVCSLALFPRSSMLYSVIJ<TCFKKYRLE^EIQDYFVKNLDP 
SFKDPAVSESKRITIQQDHLTIDTLAIHFSTARPKRWLLISLGSGDFLEDMIGLKDSLFL 
SWKELAKIJ^ANILIYNYPGVKSSTGKLNLENI^TAHN^ ITYG 
YSU3GWQSAALQKNPFTNSETSWVAVKDRAPHSLPAAANSFFGPIGKLIAVLARWKMDA 
EKNSRELPC peilvys ADRFRPSEVGDDTALLPEFTLAHAIKRTPFARSKKF igevnllh 
SS PLKH PTIQKLAEAI LESLSRKN 

CPn_0929 1062301 1061186 

*CHLPS 43 kDa protein homolog_4 

EKFMAPIHGSNAFVEDILHSHPSPQATYFSSTRAQKLHEFKDRHPVLTRIASVirKIFKV 

LIGLIILPLGIYWIXQTLCTNSILPSK^LKIFKKOPOTKTLKTNYUIAI^ 

SMRRVP I I^DNVLIOTLEICLSQAPTNRWMLISLGSDCSLEEIACKEIFDSWQRFAKLIG 

ANILVYNYPGVMSSTGSSSLJCDLASAHNICTRYLKDKEQGPGAKEI ITYGYSLGGLIQAE 

ALRDQKIVAOTDTTWIAWDRCPLFISPEGFHSCRRIGKLVARLFGV^KAVERSQDLPC 

LEIFLYPTDSU^STVRQNKU^ELTIJ^IKNSPYVQNKEFIEVRLSSDID 

VALATP ILKKLS 

CPn_0930 1062851 1063330 

No robust homolog present in Genebank/EMBL as of 11/7/98
NKJ^LATCSTGU>MVPHTQVHHAIJ7rR^ 

VIGSMI LILFSS IALI YLYKKTREYDQ IALEPLPEMI SKDQS IIDFVKTRDYASLEKKAT 

FAY3*HTHYYDGSMVFYREI prfmlgsylalrkdmdroalf 

CPn_0931 1064078 1065718

lysS-Lysyl tRNA Synthetase

ID^^GWKSDIYTNILEERMTARAEYIJDHEIJFLYRSHKLQELSEIiGVVLYPYEFPGWS 
CEI?J^TFASQELGNSEAAMSRSTPRVRFAGRL 

EFT^t^GLSEDAEITPIKFIEKKLDLGDILGIIXrfLFFTHSGELTVLVETVTIiCKSLLS 
LPDKHAGLSDKEVRYRKRWLDLISSREVSDTFVKRSYI I KLIRNYMDAHGFLEVETPILQ 
N I YGGAEAKPFTTTMEALH S EMFLRI SLE IALKKI LVGGAPR IYELGKVFRNEG I DRTHN 
PEFTMI EA Y AAYMD YK EVMVFVENLVEH LVRAVNHDNT S LVY SYWKHG PQ EVDFKAPWI R 
MTMKES I AT Y AG IDVDVHSDQKLKE I LKKKTT F PET AFATASRGML I AALFDELVSDNLiI 
AP^8 ; ITDHPVETTP^KTLRSGDTAFVt:RFESFCU3KELCNAYSEI^PIRQRELLEQQH 
TK&EELPDS ECH P I DE EFL EALCQGM P PAGGFG IGVDRLVM I LTNAAS IRDVLYFPVMRR 
FDAJEKTN 

CPn_0932 1067160 1065721

cysS-Cysteinyl tRNA Synthetase 

VK SSfcVMAFSH I EGLYFYNTASQKKELFF PNHT PVRLYTCG PTVYDYAH I GNFRTYVFED 
I L KRT L VF FG Y S VTHVMN I TD VEDKT I AG AS KKN I P LQ EYTQ P YT EAF F E CLCTLN I ARA 
DFYPHATHY I PQMI QAITKLLEQG I AY IGQDASVYF S LNRF PNYGKLSHLDLSS LRCC SR 
I S ADEYDKENP S DFVLWKAYNP ERDGVI YWES PFGKGRPGWHLECS I MAMELLGDS LDI H 
AGGVBNI FPHHENE I AQSEALSGKPFARYWLHSEHLLI DGKKMSKSLGNFLTLRDLLHQE 
FTG^^R YMIXQSHYRTQLNFT EEALLAC RHAL RRLKDFVSR LEGVDLPGES PLPRTLDS 
SSQftfEAFSRALANDLNVSTGFASLFDFVHE INTL IDQGHFSKADSLY I LDTLKKVDTVL 
GVLg^TTSVC I PETTVMQLVAEREEARKTKNWAMADTLRDEILAAGFLVEDSKSGPKVKPL 

CPn_0933 1067532 1068578 

predicted disulfide bond isomerase 

PVILLQNIKRCSLKQLKVLATLLLSLSLPTLEAAENRDSDSIVWHLDYQEALQKSKEAEL 
PLLVIFSGSDWNGPCHKIRKEVLESPEF I KRVQGKFVCVEVEYLKHRPQVENIRQQNLAL 
KSKFKINELPCMILLSHEEREIYRIGSFGNETGSNLGDSLCHIVESDSLLRRAFPMMTSL 
SLSELQRYYRLAEELSHKEFLKHALELGVRSDDYFFL3EKFRLLVEVGKMDSEECQRIKK 
RLLNKDPKNEKQTHFTVAL IEFQELAKRSRAGVRQDA30VIAPLESYI SQFGQQDKDNLW 
RVEMMIAQFYLDSDQWHHALOHAEVAFEAAPNEVRSHISRSLEYIRHQS 

CPn_0934 1068948 1068526 

rnpA-Ribonuclease P Protein Component 

YFVHPLTLPKOSRVLKRKQFLYITRSGFCCRGSOATrr/VPSRHPGTCRMGITVSKKFGK 
AH ERNS FKRWREVFRHVRHQLPNCQ I WFPKGHKQP PVFSKLLQDF INQ I PEGLHRLGK 
TKATTGGECTPKSEKCVTAPR 

CPn_0935 1069100 1068957

rl34-L34 Ribosomal Protein 

EDTV K RT YO PS K R K R RNG VG FRT RMAT RNG R K LLNP P P RHG RH S LVD L 

CPn_0936 L06O330 1069470

rL3f-i-L3h Ribosomal Protein

YLMKV.^CSVKADP^IvGDKLVRRKGRLYVIMKKDPNPKORQAGPARKK 

':i'nj)'.ii7 li><v>4H7 L0t.'>7'>* 

r:;M-:JM Ribosomal Protein

7KRMAKK::::VAI<FAKRRRLVEANFKKR';DLRKIVK':LT/ r ;CECKCNARI^LMKMKRDTSP 
TRLI1NK -IJ.Tf ;RH<<JYU<KFAI:JRlCFRgMA^MC'Crf-',yiKA ,W 

• Tn_0'» Hi ;o v,S L0'>'<>M'> 

'TVftrt hyp.>rh-t i, .il ptor.-m -ll.-.j.tor r*/n p« pt uU- p«-r i plasm lc I 
'J? ( ML: '/ "I'l .KTM[ A ' I [[ .1 A 'YV I U ) AY I ADKKKRfr/ ^'MfFAiWWV^F I GLWLLLL 

i :hrnai .i-:k r vnr >PFPN. f ;n: .fddi.kk'jl/V ;mdc e p v/;DLon i v i ijtf.kwfylnkdremv 
';i'i:;Fi-:i-:i.vvi.i.KOKTYpn::wvwKK< smkdwohvkuvpslwal.kfmik 
CPn_0939 1070h2 f; * 107LM5

CT790 hypothetical protein 

KINRVrrrRLSLTLIIGTVLYFFSEEXELIGGGKMEKQNLKLDVKEIEFPETVFSRDIETR 

VIQVIILHCIAKINGVSIXXjCNLIDALFGRDIERWKGIWEQOSKNHLVXV^ 

VS IPEKTEEIOGCrvSEISEYTGLHVAAVHVI IKGLTQPKDR IDEEIEEEV3VQDLPSPE 

DFLLENSEG 

CPn_0940 1073010 1071204

Lt.vMPrn ir" " ">~t»)\r :>". *;• :vvr , : \ - '.*ir;.\ '"mckxv 

ERIPFLMKKTASIETIWSNETEALLLENNLIKGHHPKYN\'LLKDDKTFFCLAISLSHSW 
PKVEAI RTKAITSSORQLI FGPYVSAEACHTLLEVI SQWFPLRTCS DREFALRKRPCILY 
DMKRCLAPCVGYCTPEEYQGTLDKAILFLKGKIEEVVKDLEKVIQKA5t»^EFEQA^ 
RTLSL I KQAMAKQQVEKFHFQNIDAI^LYRHKQRTILTLLTVRSGKLLGARHFSFFENAQ 
EDQDLLSSF ILQYYVSQPY I PKE ILTPLPLEF PTLSYVLNAESPPRLRS PKTGYGKELLD 
LAYRNAKAYAATTLPS STL PYQD FQN I LRMSQY PYR I EC YDNAHMQGAHATGVY I VFENN 
G FDPKQYRTFS I DSEKTQNDLALLEEVLL RRFHSLTTAL PDM I WDGGKTHYNKTKK HQ 
TLNLTG IQWT IAKEKSNHSRGLNKEKIFCETFPEGFSLPPTSNLLOFFQI LRDEAHRFA 
ISKHRKKRGKAI^EQEKrPGIGEVKRKRLI^KFKSWKQVMLSSQEELEAIPGLTKKDIAV 
LLARQKDFNKSD 

CPn_0941 1075504 1073018 

mutS-DNA Mismatch Repair 

VMTEKKPTPM>ffiC^C^KEKAGDSVLLFRMGDFYEAFYDDAVLLSQHLELTLTQRQGIPM 
SGIPVSTVDTYVDRLIGKGFKVAVAEQFGEPAKEKESKKIGPMARDIORFVTPGTLLSST 
LLQEKFNNY IVAINRXGSLFGFACLDLSTGSFF I EECEWTKELVDEICRLAPSEVLSCNK 
FYNKETAIVMQLOQHLPX.TI^STY'ADWAFEHKFASOKLTTHFQVASLIX^FGL^ 
AGGLLSY IQDKLLLPTKHXAr PQTRGKQQKLL IDTASQVNLELLAPLNDPQGKNSLLRIM 
DHTSTPMGGRLLRQ IL ISPFYNPKEILVRQDAVEFFLRQVTLRKNI KTYLCQVRDI ERLM 
TKVTTGIjAGPRDIGTIJU^SFSAGAQIYEQIJ^ATLPEFFI^ 
NGDLPLRVStX^IFVDEFllNDLKRIJUiNQEHSQEWIWEYQEllIRKETGIKKLKIC 
GYYIEVSS EFAPQLPKDF I RRQSRLHAERFTTI ELQOFQDDMSN I s eklotletoffkdl 

cshilquiteiiju^sqsijudijdyii^ 

vdtgkf i pndtemrgsotrm i lltgpnmagkst:' i rq i allv imaqmgsy i paksah igv 
idkiftrigagdni.skgmstfmvemaetanilhnatdrslvii^evgrgtstydgiai 
avveyllftdkkkaktlfathykelttledhc PHVENFHAGVKDKAGQ PVTLYE I LKGHS 
qksfgihvarlagfplcwsraqqilrqlegpesitrpaqdkmqqltlf 

CPn_0942 1075955 1077754

dnaG/priM-DNA Primase

NC S ITKLRT AMYTE ES LDNLRH SID I VDVLS EH I HLKRS GAT YKACC P FHTEKT PS F PVN 
PAGAHYHCFGCGAHGDAIGFLMQHI^SFTEAILVLSKKFQVDLVI^PKDSGYTPPOGLK 
EEIJ^INSEACTFFRYCLYHLPEARHALQYXYHRGFSPDTIDRFHLGYGPEQSLFTiQAME 
ERKISQEQLHTAGFFGNKWFI^ARRIIFPVHDALGHTIGFSAPJCFLENSQGGKYVNTPET 
PIFKKSRIL FGLNF SRRR I AKEKKVI LVEGQ ADC LQM I DSG FNCTV AAQGT AFTEEHVKE 
LS KLGVLKVF LL FDSDEAGNKAALR VG DLCCT AQMS VFVC KL PQGHDPDS FLMQRG S SGL 
I ALLEQ SQ DYLT FL I S EKM S SY ?KFG PRE KALLVEEA I RQ I KK^G S P I LVYEHLKQLAS L 
MMVPEDMVLSLANPQVTAEPQN I P I KQKVPK I H PH IVMETD I LRCMLFCGSNTKI LYTAQ 
FYFVPEDFKHPECRKLFAFMISYYEKYRKNVPFDEACQVLSDSQILQLLTKRRIJ^ 
TIFVOSLQKMADRRWREQCKPLSLi^NIQDKKLEILEDWQLRKDRTIITLliDPESELIP 

CPn_0943 1077972 1078238

CT794.1 hypothetical protein 

FFMKSFKFLLPFLSVI LCCGNLLSSPRSRAI SVTES IGMSAVKTLVLSEKAHEFLEGIGY 
GVGASS I LRDWQTQQWL E I ESLLAQNEVM 

CPn_0944 1078503 1078997 

No robust homolog present in Genebank/EMBL as of 11/7/98
I K IMMHRYFI PLLALL I FS PSLVRAELQPSENRKGGWPTQLSCAEGSQLFCKFEAAYNNA 
I EEGKPG I LVFFSERPTPEFADLTNGSFSLSTP I AKGFNWVLCPGL I S PLDFFHKMDPV 
ILYMGSFLEMFPEVEAVSGPRI^YILIDEQGGAQCQAVLPLETKN 

CPn_0945 1079001 1079660 

CT795 hypothetical protein 

SIFK^ILPSYFGHNFDQLRRHYMRrALSLLSLLMIFPIFGEESRPGSEDGNSNTQEIVG 
SQDTQVCLYHSYEQGLQASRI EGKPLVI WLCNSGDDGOACT IGLS ETCEEVLSVLSGS I 
FSEIjANFVVLVPSGVN PLIYPPIEDPI LAE I VKFKELFKDES F PTGLS 1 1 WGVTP EG PG 
D 1 1 EVS PVS LTVE EEETLPS EQTT EVESTS ELQS EDPAI A 

CPn_0946 1082816 1079745 

glyQ-Glycyl tRNA Synthetase 

G ECQKK KCYTLES FVS EH P LT LQSM I AT I L RFWS EQGCV I HQGYDL EVG AGT FN PATFLR 
ALGPEPYKAAYVEPSRRPQDGRYGVHPNRLQNYHQLQVILKPVPENFLSLYTESLRAIGL 
DLRDH D I R F I HDDWEN PT IGAWGLGWEVWLNGME I TQLT YFQA I G S K PL DT I SGE I TYGI 
ER I AMY LQK K I S I YDVLWN DTLTYGQ ITQ AS EKAWS EYN FD YANT EMWF KH F EDFAEEAL 
RT LKNGLSVP AYDFV I KAS H AFNI LDARGT I SVTERTRY I AR I RQLTRLVADS YVEWRAS 
LNYPLL3LSSTSEPKETSESWPMISSTEDLLLEIGSEELPATFVPIGIQQLESLARQVL 
TDHNIVYEGLEVL^SPRRLALLVKWAPEWQKAFEKKGPMLTSLFSPDCDVSPQGQOFF 
ASOGVDISHYQDLSRHASLArRTVNGSEYLFLLHPEIRLRTADILMOELPLLIQRMKFPK 
KMVWDNSGVEYARPIRWLVALYGEHI LPITLGT I IASRNSFCHRQLDPRKIS ISSPQDYV 
ETLRQACVWSOKERRMI I EQGLRAHSSDT I SAI PLPRL IEEATFLSEH PFVSCGQFSEO 
FCALPKELLIAEMVNHQKYFPTHETSSGAISNFFTWCDNSPNDTI IEGNEKALTPRLTD 
GEFLFKODLCTPLTTFIEKLKSVTYFEALGSLYDKVERLKAHQRVFSTFSSLAASEDLDI 
A rQYC7KADLVSAWNEFPELQG IMGEY'i'LKHANLPTASAVAVGEHLRH ITMGQKLSTIGT 
LLSLLDRLDNLLACFILGLKPTSSHDPYALRROSLEVLTLVGASRLPIDLALILLDRLADH 
FPSTIEEKVWDKSKTtHEILEFIWGRLKTFMGSLEFRKDEIAAVLIDSATKNPIEILDTA 
EALQLLKEEHTEKLAVITTTHNRLKKILSSLKLSMTSSPrEVLGDRESNFKOVLDAFPGF 
PKETCAHAFLEYFLfLADL^NDIODFLrJTVUIANDDGArRNLRISLLLTAMDKFSLCHWE 

CPti_t)'M7 K1S!4!J lf)H4() [ j'! 

pq<-.A ''Jlycerol * V Pho L :phatydy i r t <irt'J L p r . r-.^ 

GSRvrjLr*hr 1 'iTr:JRtaMTprFM[LYLK^Kwr , r J r'rPvvr,!'-r/i J LAL.LArr;Li;rDA[^WA 

PKF^O'/TDLaJKLLDF'MAII U YR fJ r7LTrT'JE' , ['VfJLr , LLLVr I n^\RD: jV [.'TLRTVCAF 
Rf]RW/-^RAJc;KL ) KArt^Ka'-:K['I 1 [L[ J 7M[L'ir;[/;i,U;0N^LKr!- , \:;V , rV-':i iavy-uaj 
G I EYFV/MNKNI 'L.';UR*\KTKP: it'KriHi /JKD 

' Pn_D*MH 1 us'Mfi ■ 1 OH41M / 

glgA-Glycogen Synthase

sc::mp tvovAvnTPiVKVtu;u ;[jA , /A::r,::Ki:i,AKQNrjvi'r/LLPiiYi'i,i::Ki , ::.>:;ovi.:;F 

P : ' FY Y L FLG KOOA: : A I ' * Y: ; Y !*'( ; I ;n <T I f T[ S>' '.(J I EI.F'jTT'JVi': IFNNWRI *'*A[ 7VAAAAA 
YLOEADFAD [ VHLHDWHVGLLAGLLKNPLNPVH3K I VFT IHNFGYRGYC3TQLLAASQID
DFHLDHYQLr RDPQT3VLMKGALYC3DY ITTV3LTYVQEI INDYSDYELHDAILARNSVF 
IJG I ING EDEDVWNPKTDPALAVQYDA3LL3EPDVLFTKKEENRAVLYEKLGISSDYFPLI 
CVE3R rVEEKGPEFMKEI ILHAMEHSYAFI LIGT3QNEVLLNEFRNLQEXZLASSPNIRLI 
LDFNDPLAR LTYAAADMIC I FSHREACGLTQLIAMRYGTVPLVRKTGGL^ADTVIPGVNGF 
TFF[7TNNFNEFRAML3NAVTTYR0EPDVWIJ^LIESGMLRASGLDAMAKHVVNLYQSLLS 

cr'n_no.i'i 10R58R7 10RM33 

• ■•-[ I .*' ' ' ■ !'r r it ■ i lii 

!■■■! ■'.., .; ..-vi-'Y 1 :r;ri i mi at 'PCT'If:--! i .ki ';['' v 'w w- \ v an* 'tycal'.t 

KKFLSNLESGALSSTVF3LSYEGRI IKALVKDIQYQITTYDVIHLDFEELVEDRPIKLNI 
P I RC I NAVDC IGVKLGGS LRCV I RAVRWCK P KD I VPFLELDVRSVGLSCTRKLSD I K I P 
AGI ETITPLKEVAITVSRR 

CPn_0950 1086470 1087027 

pth-Peptidyl tRNA Hydrolase 

PSLEDNMAKLIVAIGNPRHGYAOTRHNAGFLLADRLVEELQGPPFKPLSKCHAUfTLVES 
S SG PLVF I K PTT FVNLSGKAWLAKKYFNVALSH I LVLADDVNRSFGKLRLC FNGGSGGH 
NGLKS ITASLGSNEYWQLRFGVGRPLEEGVELSNFVLGKFSEEENLQLGSI FVEASTLFT 
EWCSKF 

CPn_0951 1087113 1087457 

rs6-S6 Ribosomal Protein 

-EFLM3KKENQLYEGAYWSVTLSEEAR11KALDKVISGITNYGGEIHKIHDC<3RKKLAYTI 
RGAREGYYY FIYF SVS PGAI T ELWKEYHLNEDLLRFMTLRADSVKEVLEFAS LPE 

CPn_0952 1087469 1087723

rs18-S18 Ribosomal Protein
GENM^PVHNNEHRRKRFNKKCPFVSAGWKTIDYKD 
RFOGVLSQAIKRARHLGLLPFVGED 

CPn_0953 1087727 1088248 

rl9-L9 Ribosomal Protein 

FKGRRMKQQLLL LEDVDGLGRSGD L ITAR PGYVRNYL I PKKKAVIAGAGTLRLQAKLKEQ 
RL I Q AAADKADS ER I AQ AL KD I VLEFQVRVD P DNNMYG SVT I ADI I AEAAKKN I FLVRKN 
FPHAHYAIKNLGKKNI PLKLKEEVTATLL VEVT SDNEYVTVLAQGKQTEENQEG 

CPn_0954 1088259 1088708 

ychB-Predicted Kinase 

GRKVCiTKDIMQYFSPAKlJ4LFLKIWGKRrDNFHELTTLYQAIDFGI^SLKNS>^SLSS 
NVS^ELL S P SNL I WKSL EIFRRETQIHQ P VSWH LNKS I P LQ SG LGGG S SNAAT ALYALNEH 
F^HIPITTLQLWAREIGSDVPFFFLQEQH 

CPn_0955 1088612 1089175

(frame-shift with 0954)
RATPNPYSYNNIATLGSRTJRXRCSFFFSSCT 
GEP/SEKAYQSIXPQDYSTGNHNACFYGENDLEK^ 

IXSGSGATLFVCYLEELEQDSKVSSQIHSLIKC/TQGIPVSRLYREPHWYSIJCQSTYKNSP 
LSplQPQI 

CPn_0956 1089545 1090909

CT&&5 hypothetical protein 

LV&ESM I LPPYSYSLKIGAAVLFFCS ILHTFLTPWLYTLCQSYEHKKLVFPECWKRYARL 
SE^ERILSRVEIVFFLWAVPLFFWFLYTEGYRISMAYFT^SFNYGFAVFIMVILILLE5RP 
IV^Y^AELVLSSIAKLGKTSPKSWWWTIJ^ 

SRRFAYATMGLLFSNISIGGLTSYVSSRALFLIFPALKWEHSFFLSHFAWKAIVAILIST 
TIYYFIFRKEFKKFPDIPSDKDPSVEKVPWWI ICVNIIFVGSIILSRSTPLFMGALLLFY 
LGi&^FTIFYQDPINLSKVCYVGLFYAGLVVFGDL^EWWV^ 

I FLDNALVNYLVHKLS VATDCYHYLWAGCMAAGGLTLVSNI PNI VGYL ILRSAFPSST I 
Hb|afcFLGALGPSIISLGVFWLLKNVPEFLYCFFR 

CPn_0957 1093812 1090963

icfe^Ptr-Insulinase family/Protease III 

K il^RNCKMFWKLLCPILICTSLS ITSCEQQFKWPNQC PLQVSTPAAADQKI EKI ICSN 
GLE^IISDPNLPTSGAALLVKTG^ADPEEYPGMAHFTEHCVFLGNEKYPEVSGFPGFL 
SEfciyGVHNAFTYPNKTVFVFSVEH^ 

AKKsS DG RRVHR IQQLVAPQGH PCARFGCGNASTLT PVTT EKMAEWFKLHYS P ENMCAI A 
YT^PLSKAKKQFSKIFSQIPRSKNYERQEPFLPSGDTSSLKNLYINQAIQPTSNLEIYW 
HIYESSHPIPLGCYKAIjAEVLRNESKNSLVSLLKNEQLITOLEJV^FFRSSLNTGEFYISY 

eltekgdkhysqvidstfqylryiqehgipnytleeistinalnycyssksplfdllckq 
ivslgnedlstypyhslvypkyssedesallnlvsdpeqarfvlssknsehweeatqlhd 
pifdmtyyvkaldgvqdygkvqslkp ialpkpnlfi pkevtlpgvhllkkqefpfapals 
yqddkltlyhcedhyytapklssq i rirs pq i srss poflvatelyclavndqllreyyp 
at q ag l s ft s alggdg idlrvsgytttvpallns i ltslpnleisyetflvykkqllely 
qgall^jcpvrsgldelasqvmkf^s^m , klsaj j eklsfsefqafasnlfnsvhlevmvl 

GNLSEQQKKDYLEMLOVFTASRSSHATKPFYYELQSQEISEIHHDYPLTANGMLLLLQDK 
33 PS IQGKVCAEMLFEWLHHITFEELRTQQQLGYMVGARYREFASR PFGFLYIRSDAYSP 
EELLAKTSLFLNKVSASPEKFGISQEKFANIRKAYINKILEPEHSLE3MMNSALFSLAFER 
PFVE FST PDLK I A I AET LTYE EF LKYCQC F LS N ELGTQTS VY I RGTQKT S 

CPn_0958 1094803 1093793 

plsB-Glycerol-3-P Acyltransferase

rYRAIYMQFSRYLRYAFDNQYLPEPLYQKFSVFHQNYIDAATKKAAAI>2AEVLCLQWVKV 
r I EDLKNPF I FPPYHKKIRAP IDLFRLS I DFFSLVI DDKNSR ILNLHRLKEI EEYI ARGD 
NWLLANHQTECDPQLKYYALGKTH PELMENM I FVAGDRVTSDPLARPFSMGCDLLC IYS 
KRHrATPPELRESKLLHNQKSMQILKTLLNEGGKFrYVAPAGGRDRKNAEGRLYPSEFSP 
E3 rEVFRLLAKASNQTTHFYPFALKTYDI LPPPPKI ENAIGEQRAIFFAPVFFNFGAELF 
FDALC::KEEL I HCDKHAQRTLRAEKVF5 IVKNLYEEL 

<:t J Ti_0'J l i'i lU'tbJ76 10 1 M79 C > 

cafE-Axial Filament Protein

AO* JYG I ' jTRKVMENEI LLN I ESK EI R YAM LKNGOLFDLT I ERKKVPQLKGN I YRGRVTN I 
LRNTQ^AF :NTDERENc;FIHI3D[LENSKKFE0MFDMDVDALPEEAGEAPLL33EEAPEE 
RF[.KLLt:;PV[,VOWKE[' [Cj^KGARLTSN E I PGRYLVLLPNSPHRGV53RKIEDPHMREQL 
KoL I P..' I f-'RM 1'<J[M ( j L 1 1 * RT A3TTA3T EAL I NEAHDLLLTWKT ILCKFYSTEQPCLLY5ET 
I; [ LKKA7 t Tt: I DKNYKRLL I DDYATYQKCKMMLKKY" PDA3 1 KI EYYRDSI PMFERFN EE 
KE-IDKATHHK [WLilliJGUYLFFDKTCAMHT I DVN SG P 3TQ L C3G V E ETLVQ I ML EAAEE LA 
l'Ol.ftr.I'MVrx;i.V [ I OF [DMK.iRKNORRVLERLKEHMKYDAARCTrLSMSEFOLVEMTRQR 
NK R: ICWgTLFTLCPYr^UNA 1 1 KTPFSW I E I ERDLKKVINHKEH'iHLCLWHPEI ASYM 
KQWirjCiNI-^IlNLAKOt.KAKt^TNT'JD'TVMLMHYOFF^LrTGEGCDL 



CPn_0i*»0 lO'^'w-i L-- 1 1 7 102 

CT809 hypothetical protein

3L3LV3YLSNPCKALVLG3KGF3MDCVDNLKLY t FRLKLFVIDTER I ~Y3 I JPEY IREKGE 
EELLNSPIEV^LGRISSDCVILJLSLKTC'LGLCCPVCNNFFSHGVCLPDLORVISHDE 
VGSGVFIXRPLIRQELI-^E3DCFEECSGQGCPERKN ILKFLEDRKKHEGN3PFEYL 

CPn_0961 1097 10b 10'>7217

rl32-L32 Ribosomal Protein



CPn_0962 1097301 1098275

plsX-FA/Phospholipid Synthesis Protein

I LSDFM EVQ IG IDLMGGDH S P L WWQVLVDVL KSQS ST I P F A FTLF ASEE I RKQ I Q EE F I 
SDLPQEKFPKI ISAENFVAMEDS PLAAI RKK3SSMALGLDYLQEDKLDAFISTGNTGALV 
TLARAK I PL FPAVSRPALLVCVPTMRGHAVI LDVGAN I S VK P EEMVGFARMG LAYRQCLG 
DSKIPT IGLLN IGS EERKGTEAHRCT FRMLRETFG EAFLGN I ESG A VFDGAAD I WTDG F 
TGNI FLKTAEGVFEFU3RI LGDKLEAD rQRRLDYTFYPGSWCGLSKLVIKCHGKACGS S 
LFHGILGS INLAQARLCKRI LSNLI 

CPn_0963 1098374 1103224

pmp_21-Putative Outer Membrane Protein

TPLRFKVAMVAKKTVRSYRSSFSHSVIVAILSAGIAFEAHSLHSSELDUGVFNKQFEEKS 
AHVEEA0/TSVLKGSDPVNPSQKESEKVLYTQVPLTQGS3GESLDLADANFLEHFQHLFEE 
TTVFGIDQKLVWSDIJTTRNFSQPTQEPITrS^VSEKISSITrKENRKDLETEDPSKKSGLK 
EVSSDLPKSPETAVAAISEDLEISENISARDPLOGLAFFYKNTSSQSISEKDSSFQGIIF 
SGSGANSGLGFENLKAPKSGAAVYSDRDI VFENLVKGLS F I SCESLEDGSAAGVNIWTH 
C G DVT LTDC ATG LDLEALRLVKDFSRGGAVFT ARNH EVQNNLAGG I LSWGNKGA I WEK 
NS AEKSNGGAFACGSFVYSNNEKTALWKENQALSGGA I S S ASD I D I OGNCSA I EFSGNQS 
L I ALGEH IGLTD FVGGGAIJ^CGTLTLRNNAWQCVKNTS KTHGGA I LAGTVDLNET I S E 
VAFKQNTAALTGGALSANDKVIIANNFGEILFEONEVRNHGGAIYCGCRSNPKLEOKDSG 
EN INI IGNSGAITFLKNKASVIjEVMTQAEDYAGGGALWG HNVLLDSNSGNIQF IGN igg s 
T FW IG EYVGGGA I LST DRVT I S NNSGDWF KGNKGQCLAQKYVAPQ ETA PVES DAS STNK 
DEKSLNACSHGDHYPPKTVEEEVPPS LLEEHPWSSTD I RGGGAI LAQH I F ITDNTGNLR 
FSGMiGGGE ESSTA^DLAIVGGGALLSTNEVNVCSNQNVVF S D^A^^ SNGCDSGGAI LAKK 
VDISANHSVEFVSNGSGKFGGAVCALNESWITD^SAVSFSKN^ 

TICGNCGNIAFKENFVFGSENQRSGGGAI IANSSVNIQDNAGDILFVSNSTGSYGGAIFV' 

G S LVAS EG SNPRTLT I TGNSGD I L FAKNS TQT AAS LS EKD S FGGGA I YTQNL K I VKNAGN 

VSFYGNRAPSGAGVQIAIXSGTVCLEAFGGDILFEG^INFDGSFNAIHLGGNDSKIVFXSA 

VQDKNIIFQDAITYEENTIRGLPDKDVSPLSAPSLIFNSKPQDDSAQHHEGTIRFSRGVS 

KIPQIAAIQEGTLALSQMAIXWLAGL^QETGSSrVLSAGSILRIF 

EETLVSAGVQI>MSSPTPNKDKAVI]TPVLADIISITV^ 

GTKIJ^SNAIDIJCIIDPTNVGYENHALLSSHKDIPLISLKTAEG^fI^ 

VS L P S I T PATYG HTGVWS E S KMEDGRL WGWQ PTGY KLN P EKQGAL.VLNNLW SHYTDLRA 

LKQEIFAHHTIAQRMEIXiFSTNVWSGIjGWEDCQNIGEFDGFKHHLTGYAI^L^^ 

DFL IGGCFSQFFGKTESQSYKAKNDVKSYMGAAYAG I LAG PWL I KG AFVYGN INNDLTTD 

YGTLG I STGSWIGKGF IAGTS IDYRYIVNPRRF I S AIVSTWPFVEAEYVRI DLPEISEQ 

GKEVRTFQKTRFENVAIPFGFALEHAYSRGSRA£WSVQUVYWDVYRKGPVSLITUCDA 

AYSWKSYGVDI PCKAWKARLSNOTEV^SYLSTYLAFNYEWREDLI AYT)F^KX3I RI IF 

CPn_0964 1104812 1103301 

No robust homolog present in Genebank/EMBL as of 11/7/98
QS ILES I IKYFYL I HNSKMHMSNP I S LFS PAEL I AKYNL I PKTS P I YPRRTEL 1 1 LEENA 
CQTRLTNVAQVLHPSSLFSMSKKILNPCGCSGGPLCWVI LNILAFI ITSVLFI ILLPVNL 
IVAG^LFMPLPPKKIVEDL^EPTTEETNEVIQPFIFALQADLFEDNKLRSFKIVEQSVG 
KAPLPNPFIJ^VAISPQESQEAMRKIPDLCSQU<KVLKSLGVLTPEWKHMLKYFEGI^ 
EHDSNPDKKTFP ILIKLL IEALTGKSSLPKTPSTKEKMQAALFIASSCKTCKPTWGEVrT 
RSLNRLYS I ANEGDNQLLI WVQEFKERELMS IQDGDDAEEYRFAAQQHGERYTEAIEQVL 
RN ES AAKLQWHV I NTMK FFHGKNLG LVTE HLQDTLG ALT L RQTTVDTH QGR EDADL SAAL 
FUvH<YLlNSGNQLVNSVFKSMQKADPETKALIREFALDILYAS LRLPGTSAHTEVFSTLLM 
DPETYEPNKAC IAYLLYVLKI I EL 

CPn_0965 1106769 1104925 

lpxB-Lipid A Disaccharide Synthase

KGFSFSKVGLNMI PSGLVYLLYPLGFLASLFFGSAFS IQWWLSKKRKEVYAPRSFWILSS 
IGATLMIVHGT IQSQFPVTVLHVINL I IYLRNLNITSSRP I SFRATLVLMALSWFVTLP 
FLYVNMEMiASPNIFHLPLPPAQLSWHLIGCI/alAIFSGRFLIGWFYIESNNTKDFPLLF 
WK IG LLGGLLALVYF I R IGDP I N ILCYGCGLFPS I ANLRLFYKEQRST P YLDT HC F LSAG 
EASGDI LGGKLI OS IKSLY PNI RFWGVGG PAMRQEGLQP ILNMEEFQVSGFAEVLGSLFR 
LYRNYRK ILKT I LKHKPATLI F I DFPDFHLLLI KKLRKHGYRGK I IHYVC PS I WAWRPKR 
KRILEOHLD^LLILPFEEGLFKOTSLETVYLGHPLVEEISDYKEQASWKEKFLNSDRPI 
VAAFPGSRRGDISRWLRIQVOAFLNSSLSOTHQFVVSSSSAKYDEIIEDTLKAEGCQHSQ 
1 1 PMNFRYELWRSCESCALAKCGTIVLETALNQTPTIVMCRLRPFDTFLAKYIFKILLPAY 
SLPNIIMNSVIFPEFIGGKKDFHPEEIATALDLLNQHGSKEKOKEDCRKLCKVMTTGQIA 
SEEFLKRIFDTLPAV 

CPn_0966 1108055 1106748 

pcnB_2-PolyA Polymerase 

LLITIIMVCENNILSGRGLELLKKKSNITLTPTIYSVSNHNIKLKDFSPHALSVIKTLRK 
AGYIAYIVGGCIRDLLLNTTPKDFDI3TSAKPEEIKAIFKNCILVGKRFRLAHIRFSKQI 
IEVSTFRSGSTDEDVL ITKDNLWGTPEEDVLRRDFTINGLFYDPEHEEI IDYTGGVNDLR 
NRYLRTIGDPFTRFKQDFVRMLRLLKILSRSPFTVETQTQEALIACROELIKSSQARVFE 
ELIKMLNSGAAKNFFQLLIENHLLEILFPYMDKAFRLNPALEEQTATYLKALDDKILKKE 
AEYDRHQLMAIFLFPLVNFNVRYKHQKHPYLSLT3VFDYIKNFLEQFFADSFTSCSKKNF 
I LTALI LQMOYRLTPL I FTKKALFFNKKLLHHTRFLEALGLLErRS I VYPKLDKVYVAWI 
RHHQTLKCKKDSHSQK 

CPn_0967 L10H431 110^3^ 

mrr>A/pgm-?tio;-phoo i urnmutanp 

FT AY K FA F I C AC RC EK I R R I C I DF R R NMQOS V f> Y L FGT CXJ V RG RAN f ' E PMTV ETTVL LG K 
AVARVLREGRi'^'IKHRVV'W'KLTrPLljC/YMFF-NAI, [AGLM'JM r j I ETTI.VLGP [ PTFGVAF TTR 
AYRADAvlIME;'A::MNPYRPNt; t K I F.SLEGFK I >OVI.EQP I ETMV.SFv\DFf JPLPEDHAVGK 
NKRVIDAMCiRWEFVK Vr^PKf;PT[.K^LKrVLLX-AHC;A r ;YKVAI j :;VFF;Rr,nAFVICYGCE 

pTGrNrNCHL\^-\LF!'o\ ioKAvrniyAHLGiAt.r/;D(Jtjp I rMVDKKnn[vrn;DM[L. / >rCA 

ODLKKR'JALPI INKW-Vl' I M'I'N! f,V\,K ^LKCUAJI/jVFT^PWDr'IIVr.HAMIJ-'MEVTUTGEO 
SGUM [ FLDYNTE\ I [\ r :* M aj'/LR I M f r.:*F" vMI / ^Or^TAf' f VK^POTI . ENVAVRPK [ PE^ET 
[ PL f CRTLRDVOHAl k '.1 :\ )H 1 1 J t UY: It ITRN 1 1 I'VMVRf JHKKHOVrX'I AKAl .A [TV I DAKLr, 
DRMCCU FTJYLCNQDGVSIVLBGLAKLEYRGYDSACLAAVVEQELF IRKTVCRVQELSNLF
QEREl FTA3V ICHTRWATHCVPTE INAHPHVDEGRSCAVVHNGI I ENFKELRRELTAQG I 
3FASDTDSEI rVQLFSLYYQESQDLVFSFCQTLAOLRGSVACALIHKDHPHT ILCASQES 
PL I LG LGKEET F T AS DS RA FF KYTRH SQ ALASG EFA I VSQGKEPEVYNLELKKIHKDVRQ 
ITCSEDASDKSC YCYYMLKE I YDQPEVLEGL I QKHMDEEGH I LSEFL3DVP I KSFKE IT I 
VACGG3 YH AGYLAKYI I ESLVSTPVH I EVASEFRYR RPY rGKDTLG I LI SOSGETAITTLA 
ALKELRRRN IAYLLC ICNVPESAIALGVDHCLFLEACVE IGVATTKAFTSQLLLLVFLGL 
K L ANVHCALTH A E£T 3 FOQG LO S L PD LCQK L LM I EG LH SWAQ PY SY EDK FL F LG RR LMY P 
V7MK.V\i <K I iKii i A l ii AtiA iT- > J*K\V i V . ' , " : '. \< ' '"JDUI f>' > M L> ,NMME\»K 
■-.i': i Ai [V [ A [ APi:. :i J ' LI AA'/.'E ' r ' ■ ' ' . 7 1 r 7 A '" ; /'VVMA -V- ' , \ v ,MC T IV 
PRNLAKSVTVE 

CPn_0969 1111803 1112999 

tyrP_1-Tyrosine Transport_1

VYVMSN KVLGGS LLI AGSA IGAGVLAVPVLTAKGG FF PATFLY rVSWLFSMASGLCLLEV 
MTWMKESKNPVNMLSMAES ILGHVGKIS I CLVYLFL FYS LL I AY FC EGGNI LCRVFNCQN 
LG I SWI RHLGPLG FAILMG P 1 1 MAGTKV I DYCNRFFMFGLTVAFG I FCALGFLKIQPSFL 
VRSSWLTT I NAF PVFFLAFGFQS I 1 PTLYYYMDKKVGDVKKAILIGTLI PLVLYVLWEW 
VLGAVSLPII^QAKIGGYTAVEALKQAHRSWAF^IAGELFGFFALVSSFVGVAI^VMDFL 
ADGLKWNKKSH PFS I FFLT F 1 1 PLAWAVC YP E IVLTCLKYAGGFGAAVI IGVFPTLIWK 
GRYGKQHHREKQLVPGGKFALFLMFLLIVINWS IYHEL 

CPn_0970 1113452 1114648 

tyrP_2-Tyrosine Transport_2

\/YVMSNKVL<XSLLIAGSAIGAGVLAVPVLTAKGGFFPATFLYIVSWLFSM^ 
MTWMKESKNPVNMLSMAES ILGHVGK I S ICLVYLFLFYSLLIAYFCEGGNILCRVFNCQN 
LG ISWI RHLGPLGFAI LMGPI IMAGTKVr DYCNRFFMFGLTVAFG I FCALGFLKIQPSFL 
VRSSWLTT INAF PVFFLAFGFQS 1 1 PTLYYYMDKKVGDVKKAILIGTLI PLVLYVLWEW 
VLGAVSLPI LSQ AK IGGYTAVEALKQAHRSWAFY IAGELFGFFALVSSFVGVALGVMDFL 
ADGLKWNKKSHPFS IFFLTFI IPLAWAVCYPE IVLTCLKYAGGFGAAVI IGVFPTLIVWK 
GRYGKQHHREKQLVPGGKFALFLMFLLIVINWS IYHEL 

CPn_0971 1114693 1115415 

yccA-Transport Permease

EGSMGLYDRDYIQDSRVC^TFASRWGV^AGLIVTSCVALGLYFSGLYRSLFSFWWVWC 

F ATLGVSFF INS K I QTL SVS AVGGLFLLY S TL EGMF FGTI^PVYAAQ YGGGVIWAAFGS A 

ALVFGrJ^AVYGAFTKSDLTKISKI^f^FALIGLLLVTLVFAWSMFVS^[PLIYIXICT^L 

VIFVGLTAADAQAIRRISSTIGDNOTLSYKLSLMFAL3<MYCNVIMV^ 

D 

CPn_0972 1116377 1115430

ftsY-Cell Division Protein FtsY

RC rNNSLLFPSYLVSFLLLQLTLLLAMFK FFRNKLQSLFKKNI SLDLI EDAESLFYEADF 
GTg&FEELCARLRRTKKADAST I KDLI TVLLRES LEGLPSQASQS SQTRP I VSLLLGTNG 
SGKTTTAAKLAHYYKERS ESVMLVATDTFRAAGMDQ ARLWANELGCGFVSGQ PGGDAAA I 
AFSGlQSAI ARGYSRVI IDTSGRLHVHGNLMKELSKIVSVCGKALEGAPHEI FMTVDSTL 
GNNAIEQVRVFHDVVPLSGLIFTKVDGSAKGGTLFQIAKRLKI PTKFIGYGESLKDLNEF 
DLCCFLNKLFPEVEKI 

CPn_0973 1116346 1117527

"sucS-Succinyl-CoA Synthetase, Beta" 
EGKSKELFMHI^EYQAKDLLASYDVPIPPYWWSSEEF/^F,^ 

GRQKfeVI VAKSSAG ILQAVAKLLGMHFTSNQTADGFLPVEKVLI SPLVAIQREYYVAV 
IMDR^RCPVLMLSKAGGMDIEEVAHSSPEQILTLPLTSYGHIYSYQLRQATKFMEWEGE 
VMflt^QLIKKLAKCFYENTWSIiEINPLVLTLEGE 

YDPSQENVRDVLAKQ IGLS Y I ALSGN IGC IVNGAGLAMSTLD I LKLHGGNAANFLDVGGG 
ASCKQ IQEAVSLVLSDESVKVLFINI FGG IMDCSWASGLVAVMETRDQWPTVIRLEGT 
NVEJLSKE I VQQSGI PCQFVS SMEEGARRAVELSM 

CPn_0974 1117523 1118422

"supD-Succinyl-CoA Synthetase, Alpha" 

VCRE&RYMFHSLSKNTPI ITQG ITGKAGSFHTEQCLAYGTNFVGGVTPGKGGTLWLDLPV 
YDSyLEAKQATGCRATMIFVPPPYAAEAILEAEEAGIELIVC ITEG I PVRDMLEVARVMD 
NST£&L I G PNC PG 1 1 K PG ECK IG IMPGY I HL PGN IG WS RSGTLTYEAVWQLTQLK IGQS 
IGVGXGGDPIjNGTSFIDVLQALEEDPYTELIL^IIGEIGGSAEEEAAAWIQAJHCT 
IAG^&PKGKRMGHAGAI ISGNSGDAKSK IQVLRESGVTWES pah IGKTVDAVLRAKEL 

CPn_0975 1119033 1119637

No robust homolog present in Genebank/EMBL as of 11/7/98 
G I EEQVALS IAIKILKI I LALI L FPLVLLAWV I RYQ LHANFHC SWP F PG F SVNQAYKC S 
EAK I EEMLDLLDLETLEWS S RC LRQDMTFANR LEEEL IQ ELRVS ET EEL I S LGGKRNLVR 
LLLTHFFNPPKRSRVESVGHEWFPVFDRLKREEEI IGDGPITRSNEELWALLDHGTARG 
IHKTLWFSIFFKYLTQIELF 

CPn_0976 1120079 1121185 

No robust homolog present in Genebank/EMBL as of 11/7/98

I LMLVYCFDPSVPTSPEHRLMAALDRWFFLGGHRAR ILTLEGNHYRAFQENMS ISTVEK I 

LKLISYLLIPIVLIALLIRCFLHSRFKCNWKCDSLSDARVPHDVQPFNDFQLFNNQERLN 

EWKNRRYVSG I DVLMVPVDYLRSQFPGFKE I PEAI RCENYV5DGC/FSEESKTSYLRAMLT 

DIVGYILSLDETYV/TNVILKIRAMCITFESFPGKEADPNYSPRVTHHYFDESWKA1J\RHV 

LGECNMVNRLDEALI RTEKPGKEGEC ITKQFLKDYC KKHLEVMSCPDF I ESLVDEK IREF 

RC PS I LNS AVC DV I DR KCQ EH L LKAI INEANRRLPGMKNSSFTMRGNQVLFYTIFSPPKL 

PPAASSVYF 

CPn_0977 1121329 1122402 

No robust homolog present in Genebank/EMBL as of 11/7/98
LYINQFANILKSSFLMEWSFSPSVRTSFQHRVRV^LDNWFFIjGGRRLKVVSLDSCNSGQ 
ACEEYVPI5TTEKVLKILSYLLIPIVI IALLIRYLLHSNFTAKVSQKFWLKTLQLGIDIK 
:'JFrLPGSMVNTMDSATLFKAIRLECKRVDVEYHRLHSGDKWFYIPAQKLPDDLRLTHWL 
PEKETRKTEWRHMLAHVMGYLTSC<JKERLQQW0DSRS3TSLGAEKVLQYRFIDHPQSO 
CEFQRLLNEN rTTKGCEDKEVVQSDLFDMAFQCWWPQF I SV EQCPTFSEELVHEMSQKLD 
LDC [ Y P E DD EF EQ K F LNT L LKA VLH HCF EG 1 3 VASMG V I FL ECPDSLALQI PFLRNQK 

CA'nJj'jlK L L22t.S i u:j^r. 

No robust homolog present in Genebank/EMBL as of 11/7/98
K Y F FM E VY 3 FI ( P AVR' r r TQH R VMAA L DAW F F L( k T H R L F W: * L D f JCWA YQ ELV 3 1 3TT 
kkvlkll^yllvpive eall ercllhcnfr IDVCKEPWLK [RELGIDEESCKLP3SYVNQ 
VS:-;F EWFKKDK^KHPR IDVDYHTLH3KDWVVFPEVFQK I PKT r ;PF3YWF3QKETRKRDYV 
HNMl.DllV U !Yr/P:;W JflfcWLyY ISKrilYQSAT^LPrERVLQY^LTDNQELUGEVQRLLNEE 
ilATK-'J^CDKEVLI^Tf 1VSD E £CQCWWPKFLEVIO::FAr T EELVEEV.3GKLNLDFLCLEKAN 
TI iIXjl\LRN!M.L.RAWHf|{ V>F/1VD TKKVGAGL I [ YTEA f^r.O E PT3R:> 
CPn_0979 lL2J'ni LL2 £ 3443

htrA-DO Serine Protease 

G I DMITKQLRSWLAVLVGSSLLALPLGGCAVGKKESRVS ELPQDVLLKE ISGGFSKVATK 
ATPAWYIE3FPKSQAVTHPSPGRRGPYENPFDYFNDEFFNRFFGLPSQREKPQSKEAVR 
GTGFLVSPDG^/IVTNNHWEDTGKIHVTLHLX^QKYPATVIGLDPKTDLAVIKIKSQNLPY 
LSFGNSDHLKVGDWAIAIGNPFGLQATwTVG'/ 1 SAKGRNQLH IADFEDF IQTDAAINPGN 
SGGPLLN I DGOV TGVNTA IV3GSGOY IGIGFAI P.^LMANR 1 1 DOL I RDGQVTRGFLGVTL 

TKG I L 1 1 S V EPG SVAASSG I APGQ L I LA VNRQ KVS S I EDLNRT LKDSNN EN I LLMVSQGD 
VIRFIALKPEE 

CPn_0980 1126968 1125504 

*similarity to Saccharomyces cerevisiae hypothetical 52.9KD protein

FVMLNH AKKHAKPYVL I FFSTKDKLSYCD 1 1 FNNCSGKPMNLDSKHFD I NS ANFLEEFAK 
FISFPS ISADSDHLQDCENCAHFLVDHVNKIFDVELWETPGHPPI IYASYKSEDPLSPTL 
MLYNffYDVQPAQLSDGWKGDPFILREENG^YARGASDNKGCCFYTLKAIXJHYYES 
PLN 1 1 WLI EGEEESG S LAL FTWLEKKKEALRADYLL I VDGGF LS EKH PYVS I GARG I VSM 
KI SLEEGNKDMHSGVLGG I AYNTNRALSE ILS SLHH PDNS IA I EGFYDDLALPSDSDRPD 
LPKSDTLRECEENLGFRPQGYEASYSPEESALRPTVE ING I SGGYTGPGFKTVI PYRATA 
YLSCRLVPNQDPDKAAHQVIHHLKQQVPSSLKFSYE I LPGGSRGWRSSANLP IVKVLQEI 
YSDLYNEECLRLVMPATIPIGPLLGEAAQTS P I ICGTSYLSDDIHAAEEHFSMDQLKKGF 
LS I CQLLDKLPK IKE 

CPn_0981 1127019 1129952 

Zinc Metalloprotease (insulinase family)

VTESMKAGDTYRNFIIKSCKDLPEIESKLLEAEHKPTGASIMMI\/NNDEENVFNICFRTC 
PQTSNGVAHVLEHMVLCGSENYPVRDPFFSMTRRSLNTF INAFTGPDFTCYPAASQ I PED 
FYNLLSVYIDAVF^PLLTKQSFLQEAWRYEFNSENHLCYTGVVF^EMKGAMMS 
ALNAAIFPSVTYGVNSGGEPREIVTLSHEDVRAFHQSQYSINRCLFYFYGNIKPSRHLDF 
LEEKLLRQATKLEKQAVSVPLQKRFKEPVRNI LTYPVDHQ EE DKVLFG I SWLTCS I LEQQ 
ELLALHVLEI ILMGTDASPLKSRLLKSGFCKQTEMS IENDIREI PMTLVCKGCSPAGAQK 
LEALIFASLEEI IREG ISENIVEGAVHQLELSRKEITGYSLPYGLSLFFRSGLLKQHGGS 
AEDGIJIIHSLFSELRNSLKNSDYLAKLIRKYFLJ3NPHFARVILLPDTELVAKDNKDEC^ 
LLSVSEKLTDENKEKIQQNVHELTESQEQKEDLNGILPNLALDKVPTSGKEFPLIKK 
QGEVI^HECFTTOIWIDWIJDIPPLSGEFXPWIJ^ 

HTGGVDVSYDFSPHANKNSFLSPSVS IRGKALSSKSEKLCGIVSDMLTSVDFTDIPRIRE 
LLMQHNEALTNSVRNS PMSYAVSMACSGNSITGAMSYLTTGLPYVKKIRELTKNFDQNID 
EAWI LQRLYTKCFSG KRQ I VI SG SAHNYQQLKDNKFYGLLDYL I V I P E PWEN PS I NLYV 
TSRGLH I PARAAFNALAF P IGDI AYDHPDAAALTVAAE I LDNWLHTK I REQGGAYGSGA 
AANI^RGSFYCYSYRDPEIATTYKTFLKGVSEIASGN 

GSRASVAFYRLKSGRIPVLRQAFRRSVLEVTKEHICMVMDKYLESTVQETTLISFAGEEM 
LRNNVLT LDKD F P IVP AI 

CPn_0982 1131215 1129962 

yigN family 

KKELASVMNLFVSLACLIJLjSGCWFLGVWSSSLYARKKRAFL 

^SRHQEQLIEDFShlRLALSSHKLIKDMKEEAQNYFGDTSKSFQSILSPICTTLTTFKQS 

LETFF/TKHAFJ)RGRXJ<EQISQLlAVEKKLEHETHVLTDIIJ<HPGSRGRVrc 

AGMLJCYCDYDSQTTSAQGAFRADIIIRLPQDRCLIIDAKAPISDSYFSVEEIDKGDLVDK 

IKEHIKTLKSKSYWEKFHQSPEYVILFLPGESLFNDAIRLAPELMEIGASSNVILSSPLT 

IXALLKTIAYMWKQENLQKQIQEVSLLGKEI^RRLQVVFTHFQKIG 

SSFQYRVLPTLRKFEGLETSSSHQ IEEPTP I ESLATS FPHTCDIDTNLAVIESLEKQD 

CPn_0983 1132045 1131206 

pssA-Glycerol-Serine Phosphatidyltransferase

KNPLCYEQKKLWQIDMAGLDLEARGKRRWTPNAITAFGLCCGLFI I FKSVLRTSSSVEL 
FHRLQGLSLLLISAMIADFSDGAIARIMKAESAFGAQFDSLSDAVTFGIAPPLIAIKSLD 
GIYVGNFFSSLLLITS I IYSLCGVLRLVRYNLFSQKTVDVSKPYCFIGLPI PAAAASIVS 
LALFLASDFFPDLPAQLRVGLLS FALLF IGGLM ISPWKFPGVKHFRFNVSSFLLWTIGL 
AACLFFSGLVDHFVEVFFLVSWLYTLVGFP IFS 1 1 YRKKS 

CPn_0984 1132370 1135510 

"nrdA-Ribonucleoside Reductase, Large Chain" 

GKVMVEVEEKHYTIVKRNGMFVPFNQDRI FQALEAAFRDTRSLETSSPLPKDLEES IAQ I 
TH KWKEVLAK I S EGQWTVER IQDLVESQ LY I SGLQDVARD Y I VYRDQRKAERGNSSS I 
IAIIRRDGGSAKFTJPMKISAALEKAFRATLQINGMTPPATLSEINDLTLRIVEDVLSLHG 
EEAINLEE IQD IVEKQI^IVAGYYDVAKNYILYREARARARANKDQDGQEEFVPQEETYVV 
QK EDGTTYLLRKTDLEKRFSWACKRF PKTTDSQLLADMAFMNLYSG IKEDEVTTAC IMAA 
RANIEREPDYAFIAAELLTSSLYEETLGCSSQDPNLSEIHKKHFKEYILNGEEYRLNPQL 
KDYD LDALS EVLDLS RDQQ F SYMGVQNLYDRY FNLH EG R RL ET AQ I FWMRVSMG LALNEG 
EQKNFWAITFYNLLSTFRYTPATPTLFNSGMRHSQLSSC YLSTVKDDLSH IYKVI SDNAL 
LS KWAGG IGNDWTDVRATG AVI KGTNGKSQGVI PFI KVANDTA IAVNQGGKRKGAMCVYL 
ENWHLDYEDFLELRKNTGDERRRTHDINTASWIPDLFFKPLEKKGMWTLFSPDDVPGLHE 
AYGLEFEKLYEEYERKVESGEIRLYKKVEAEVLWRKMLSMLYETGHPWTTFKDPSNIRSN 
Q DHVG WRCSNLCT E I LLNCS ES ETA VCN LG SIN LV EH I RNDKLD E EKL KET ISIAIRIL 
DNV I DLN F Y PT P EAKQANLTH RAVG LGVMG FQ DVL Y ELN 1 3 Y ASQ EAVE FS DEC S E 1 1 AY 
YA I LA33LLAKERGTYASYSG5KWDRGYLPLDTI ELLKETRGEHNVLVDTSSKKDWTPVR 
DTIQKYGMRNSQVMAIAPTATISNI IGVTQS r EPMYKHLF-7KSNLSGEFTI PNTYLIKKL 
KELGLWDAEMLDDLKYFDGSLLEIERIPNHLKKLFLTAFEIEPEWI IECTSRRQKWIDMG 
VS LN LY LA E PDG K KLS NMY LTAWK KG LKTT YY LR SQ AAT 3 V E KS F T D I N K RG I Q P RWMKN 
KSAST3 IWERKTT PVCSMEEGCESCQ 

CPn_0985 1135432 1136571 

"nrdB-Ribonucleoside Reductase, Small Chain" 

rSVHKYCGRKKNNPRLFWCRRLRIL3 ITEKR^AKMEADI LI/]KLKRVE\'SKKGLVNCNQV 
DVNQLVPIKYKWAWEHYLNCCANNWLPTEVrNb\PDrELWr':DEL^EDERRVILLNLGFFS 
TAE:>LVGNNIVLAIFKIirTNPCARQYLLRQAFEEAVMTMTFLYI('E:;u;LDEGEVFNAYN 
RRA;) EPAKDDFOMTLTVDVI.DPMF.IVOTJECU/ F I KN LV ; Y Y [ IMI\M FFYnGFVMILS 
FURQNFMT^ [t;EQY(.'Y I LRDRT EHLNf Y ! I DL LN f J I KEENF'C'/Wri'l'.l^nLIVALtEKAVE 
r.EtEYAKDt , LPRf:ru;i.E<-: , 'MrrDYVPH :APRPLnR KILKP I YHSRNPFIVM'JETMDLNK 

ekn r-r ett p vt n yot a« : n i , : ; w 

' pn.Lj'in*. n / 1.> in/ vf, 

yijilH pf ,l dLCti-<i r KNA M- -r | iy |, • 

f- 1 .LFMKPQDE.wPPrLWKf IKPf 'f I tj[ m ;vr . /VPK'M f\ V.\\\jtl\- '/[" JY! Kjia-TON! IT*: I Al'LLC 
■:<;Nt;LWWAOA<jKDPnV[.W | AVi f.'PI- I4'7PK i w;KM INI 10 PJNLR I Vi'i itaitfi-vyyvp 
IX.-»Kr,Ul'LWNFPDPWr i KMRHRKi(PM,( J |--[-VOl 1 ' IR: iLOL'^AVFALAT! MlKTYf .1 .1 ' I KA 
I ^vniEAPRMRTPYY f KMTDTY' ,fl::wt I t II.WK'IT' / JK t FYTEF IKK AC, I 






CPn_0987 1L1743 3 H3H115 

ytgB-like predicted rRNA methylase 

LENGIFA [GFFMFAYRTLLTf^^^/QV3H£IFKTTVVPGlm ^ tDATCGNG^^DSLFLARLLQ 
GEGRLWYD IQKEALSNALLLFETHLSEQERSVI EMKEQSHEH ILEKDVKLI HYNLGYLP 
KGNKE ITTLARTTE 1 3LEY ALN I VRPDCL ITWCYPGH PECEKETHSVE3LAQRLH PKEW 
CVS3 FYVAN R CRAPRLFIFQ RQGS ES 3V DKG 



■si.;-" !';■!■ :: ,\. ■/ Uy ,-."/} i!m in.n. 1 :'■■!'! - i • 

K F F I NL I NLDQG I LKMK EAA PMH F PF PVR RSVWLNR YST FR I GGPANY F KAI HT I E EARE 

V r RFLHS INYPFLI IGKGSNCLFDDRGFDGFVLYNAIYGKQFLEDARIKAYSGLSFAALG 

KATAYNGYSGLEFAAG I PGSVGGA I FMNAGTNESD I SSWRNVET INSECELCSYSVEEL 

EL3YRSSRFHRQQEFILSATFQLSKKQVSADHSKS I LQHRLMTQPYTQPS AGC I FRNPEG 

TS AGKL IDAAGLKGLAIGGAQ I SPLHANF I INTGKATSDEVKQLIAI IQSTLKTQG IDLE 

HEIRIIPYQPKIHSPVSEK 

CPn_0989 1139552 1139016 

CT832 hypothetical protein 

LRTSIAVKCVLLTIFWIiVMATLSPEBCFSGSPISISKEFPQQKMREIILQMLYALDMAPS 
AEDSLVPLLMSQTAVSQKHVLVALNQTKS ILEKSQELDL I IGNALKNKSFDSLDLVEKNV 
LRLTLFEHFYSPPINKAILIAEAIRLVKKFSYSEACPFIQAILNDIFTDSSLNENSLSI 

CPn_0990 1139830 1140440 

infC-Initiation Factor 3 

SVALNFK INRQ I RAPKVRLIGSAGEQLG I LAIKDALDLAREAGLDLVEVASNSEPPVCKI 
MDYGKYRYG LTKKEKDS KKAQHQVR I KEVKLKPN IDEND FSTKLKQARTFVEKGNKVK I T 
CMFRGREIAYPEHGFKWQKMSQGLEDIGFVEAEPKIJ^RSLICWAPGTVKTKKKQEKS 
HAQDENQ 

CPn_0991 1140394 1140612 

rl35-L35 Ribosomal Protein 

KQ RKNRK S LMPKMKTNKSV S AR FKLT ASGQLKRTRPG KRHKL S KKS SQEKRNLSKQ P LVD 
KGQVGMYKRMMLV 

CPn_0992 1140622 1140996 

rl20-L20 Ribosomal Protein 

GKLVMVRATGSVASRRRRKRILKQAKGFWGDRKGH I RQ$ RSSVMRAMAFNYMHRKDRKGD 
FRSLWIARIJWASRIHSLSYSRLIKGLiCCANISLNRKMLSEIAIHNPEGFAEIA^AKKA 
LEATV 

CPn_0993 1140975 1142030 

"pheS-Phenylalanyl tRNA Synthetase, Alpha" 

KS F&SHSLG IRI SMEMKEEIEAVKQQFHSEIX>QVNSSQAIJUDIJCVRYU3KKGIFRSFSEK 
Lk^d^KAPCI^SL INDFK1TVED3XQ EKSLVLLASEQAEAF SKEK I DS S L PGDSQ PSGGR 
HI IjKS I LDDWD I FVHLGFCVREAPN I ES EANNFTLLNFTEDH PARQMHDT FYLNATTVL 
RTHTSNVQARELKKQO P P I KWAPGLC F RNEP I S ARS HVLFH QVEAFYVDHNVTFS DLT A 
I LSAjFYH S F FQ RKTELRFR HSYF PFVE PG I EVDVSC ECCGKGC ALC KHTGWLEVAG AGM I 
Hpp3^RNGNVDPElYSGYAVGMGIERLAMLKYGVSDIRI,FSE2roLRF 

CPn_0994 1142371 1144440 

CT837 hypothetical protein 
LFI^HRGGRMKRSRRNFEQALENLEKLKEISLATSND^ 

EAlif&VENYLLE I SCVSKSHADKALKESDFLI AGVQNVFSFLENQEDLYKSLLDEYSEVT 
KAYDEVTCK^KEVPTYDLSTDEETEEHK^ 

NDfcLVQ I IYKQNKLHETVNEGDPLTKTLLWNSEEVKNIASSLVIVNDMPLRLFYQRALSH 
LD J EAVVKVHNA VMAL F F S RY EATMVF KS P KKHN I WYFN DF LLFIJ* EAWKDLNNNVI D S Q 
ERKQTKLLASALSLG I FESKLVFEEASRYLYFNIQTKLENANGKKPLSPGQYLTDAYEEL 
HRLISKYPNGPLFKAMDRVLEHESRPYDPMI LGILPSLEGTLKLHGKS I DI I RSPS PVTQ 
SS 1 LYANCNEEFLGFLNAKAHRSEVTLVLNIQNRI SRKERARSRVI EEALEQEEHAPYVH 
AFS^EPEELLQNLESIHGDIETFAIJFFSILQEEFHKPLLASSFFLTKELKEFVGSFLKE 
KLTALKDI FFAKKKI LFRNDKLLLLHLLS YLI VFKL I ERTNPNS IVWSKDGLDYVSVF I 
AG t*iF FSREAFWDEHSLKLLLTNVLSPTLVAP.DRLVFVSH I ELLSKFVNC LKKNRQG FSS 
LKgFFKDDIEGWEFTGYLHELTEVSHKHNL 

CPn_0995 1145515 1144415 

CT838 hypothetical protein 

RMLiWKRHLLTRFWFALTSLLVLAL I FYAS IHHSLHTLKGASTAASGASVKLS I LYYLAQ 

I SfifcAEFLMPQLVA VATTSTLFAMQNKRE I ILLQASGLSLKSLMHPLLLSGAVIMMVLYA 
NFQWLHPICEKI S ITKENMDRGTTDKEQGKI PALYLKDQTVLLYSS I EPKTLTLNNVFWI 
KDPKTIYTMEKLAFTTLSLPIGLNVTQFFANDSENLELKEFFDMKEFPEIEFNFYENPFS 
KLFSAGNKNRLSEFFKAI PWNATGLGLSTQVPQR ILSLLAQF YYVL ISPLACMAAI ILS A 
YLCLRFSRT PTVTLAYL I PLGTVN I FFVFLKAG IVLASSSVL PTLPVMAFPL IVLFLLTN 
YAYAKLQ 

CPn_0996 1146592 1145519 

CT839 hypothetical protein 

AM P I LWKVL I FRY LKT AAFCTL S LICISIISSLQEI VAY I AK DVPY DTVLRLMAYQ I PY L 
LPFILPGSCFVSAFSLFRKLSDNNHMTFLRASGASQSI IMFPVLMVSGAICCLNFYTCSE 
LAS I CRYQTCKE I ANMAMT S PALLLQTLQKK ENNR I F I AVDHCAKS K FDNV I VALKGNNE 
[SHVGriKSIIPDTTKDTVKAKDWFISKLPDSLTESSSPSSQRFYIETLDELLIPKITS 
TLFAGKSYLKTRTDYLPWKQLVKQSLKHSHLPETLRRVAIGFLCITLTYAGKILGIHKPR 
FRKSIALYFIFPILDLILLIVGKNTKNLPLAFMLFVFPQLVSWWFAARAYRESRCYA 

CPn_0997 1146699 1147664 

mesJ-PP-loop superfamily ATPase 

AYKMVL3SDLLRDDKQLDLFFASLDVKKRYLLALSCG3DSLFLFYLLKERCVSFTAVHID 
HOWRSTSAQEAKELEELCAREGVPFVLYTLTAEEQGDKDLENQARKKRYAFLYESYRQLD 
A(JG EFLAHHANDQAETVLKRLLESAH LTNLKAMAER3YVEDVLLLRPLLKIPKS3LKEAL 
DARCU^YLQDPSNEDERYLRARMRKKLFPWLEEVFCFNTTFPLLTLGEE^AEL^EYLEKQ 
A0PFF3AATHQD. , ^ELPCPDCLr00AFLCKWVMKKFFNNACtAV:;RHrL0MV , t'DHLSRG 
:;CATLRMRNK[V[ IKfXJVWTD 

CPn_0998 I ['179 H 1LS05R-1 

ftsH-ATP-dependent zinc protease 

Ll'.'A ,W K FM : ; K D K KMK P E P K KM F FT 'V F F P L L F' JWF r ;7Y A KO N F f .AG K K -\RVGF: ) HQ I EH 
[ .VNLHL I VrED:'HK [ALNDNLVSF< ICRFRDVQTOFCOt.P YHYLEL [DCX1HRLDLDL0ET3 
KrU/rrU/IK^VI'NlUI.WPSALU^SPrPE^YAI^^T'^^/'^-Vt/tHZPLWTCPATPQLINL 
ICrCoi^YCl'LSR^PrALRTYr.SOLYCLIGKYLSrVI S> IG'JFTLKRELKDLYQQVEVSLTQ 
F.T r/P RA A Y* P L YG(J V I .: ! T LN R I : W L W3 EGO ER FS<J LP .'J V R L, Y R E f *WNK Y H K LV FA R DLN 
OAQ I.F.KL.Rt !F.L; KjTVWY FNNQ E LS S R S L E KQ D P E V P ; H W F AC. A K E I WT -\ F KF Ni 1 ■ : U1 F K A 



PDQPRNLVLEKTFK3QEP3PHYLCYLFTFLF E ILVLLFVYLVF;JRQMRGMSG3AMSFGKS 

PARMLLKGQNKVTFADVAG I EEAKEEL I E I VDFLKNFNK FTS LGGR I PKGVLLIGPPGTG 

KTLI AKAVSGEADRPFF3 I AC3DFVEMFVGVGA3R I RDMFEQAKRNAPC I IF IDE! DA VG 

RHRGAGIGGGHDEREQTLWOLLVEMIXjFGTNEGVtLMAATNRPDVLDKAL 

VMNLP D I KG RFE I LMVHAKR I K LD PTVDLMAVARST PGASGAD LENLLNEAALLAARKDR 

TAVTAVDVAEARDKVLYGKERRSLEMDAEERKTTAYHESGHAVVGLCVQHGDPVDKVTI I 

PRGLSLGATHFLPEKNKLSYWKKELYIDQLAVLMGGRAAEErFLGDTSSGAQQDISQATKL 

VRS^'CEW^MSPQI^^m^DERSDGLTCYGGYHEKSYSEETAKTIDTELRi^LDAAYQRA 

. . 'j: n \; :elmt,"L. i ?\'<ziwr i <'r r \?7 / ef kkjj^cl 

CPn_0999 1152859 1150766 

pnp-Polyribonucleotide Nucleotidyltransferase 

QETFMNFCTISI^TEGKILVFETGKIARQANGAVLVRSGETCWASACAVDI^DKVDFL 
PLRVDYQEKFSSTGKTLGGF I KREGRPS EKE I LVSRL I DRSLRPSFPYRLMQDVQVLSYV 
WSYDGQVLPDPLAICAASAALA I SDI PQSNIVAGVRIGC I DNQWVINFTKTELASSTLDL 
VLAGTENAI LMI ECHCDFFTEEQVLDAI EFGHKH IVT ICKRLOLWQEEVGKSKNLSAVYP 
LPAEVLTAVKECAQDKFTELFNIKDKKVHAATAHEIEENILEKLQRE0DDLFSSFNIKAA 
CKTLKSDTMRALI RDREIRADGRSLTTVRP IT I ETSYLPRTHGSCLFTRGETQTLAVCTL 
GSEAMAQRYEDU^EGLSKFYLQYFFPPFSVGEVGRIGSPGRREIGHGKLAEKALSHALP 
DS ATFPYT IRI ESNITESNGSSSMASVCGGCLALMDAGVPISS P I AG I AMGL ILDDQGAI 
ILSDISGLEDHLGDMDFKI AGSGKG ITAFQMD IKVEG ITPAIMKKALSQAKQGCND ILNI 
MNEALSAPKADLSQYAPRI ETMQ IKPTKI ASV IGPGGKQ I RQI I EETGVQ IDVNDLGWS 
ISASSASAINKAKEI I EGLVGEV^VGKTYRGRVTSVVAFGAFVEVLPGKEGLCHISECSR 
QR IENI SDWKEGDI I DVKLLS INEKGQLKLSHKATLE 

CPn_1000 1153193 1152891 

rsl5-S15 Ribosomal Protein 

SAFAAIILRRHPMSLDKGTKEEITKKFQLHEKDTGSADVQIAILTEHIAELKEHLKRSPK 
DQNSRIAIXKLVGQRRKLLEYLNSTDTERYKNL I TRLNLRX 

CPn_1001 1153369 1153869 

yfhC-cytosine deaminase 
YYLELGGEKLn^MEKDIFFMQQAFKEARKAYTX2DE , /P 

DATAHAEILC IGSAAQDI^NV^TT.DTVLYCTLEPCLMGAGAIQLAR I PR XVWAAPDVRLG 
AGG SWVNI FTEEH PFH WSCTGGVC SEEAEHLMKKFFVEKRREKS EK 

CPn_1002 1153844 1154089 

CT845 hypothetical protein 

KSA£FIKVKNKIWU^LYEQ2SSRLQK^ 
EGVLSGIGEVRAAILAALSQEN 

CPn_1003 1154862 1154092 

CT846 hypothetical protein 

TSNKT IHPLLWGPDRQIAGKASMRVI FPDKHNNFPNLSKLLKKLPSVILVTSCIAPFFSY 
IINKFFGIPGLLEILALSVKGIQKHHFV^FLTYPLITADSLSL^HCDQSFEITQRLLI^^ 
LDFFLFYKAIQHLIRKLGAFSVLWI SGQALI IGAVLW3FMALIHSSQSFFGPESIICGV 
LTVQIFIXJPEKRFTIGPTPLSVSIKWGFLFVLGFYCCILIFSGAFLLLIjASMIAIV^ 
FCKKEKI PNPYTTSLRF 

CPn_1004 1155418 1154879 

CT847 hypothetical protein 

HLS I EELMS IQPVSNTTTKADKVI PDSTKVISDS ITINKQSAFYFC ISVMLRLSESTTEY 
G K S I LA VLE DNT I VQQCRVKEL I NL P LLKVPDLQKKDG S DDEYKNQNE I Q AYQ S SNQQ I S 
ANRQM I QQELSS AQQRAQANQKSVNSTT I ESMQ I LQATS SML STLKELT I KANLTNS PS D 

CPn_1005 1155957 1155415 

CT848 hypothetical protein 

NRKPVRLNMWI IDPLSAKKPLQAAINVPGTPITGGPNTATADDI IAKFSKDSNPLIVTVY 
YVYQSVLVAC3DNLS 1 1 AQELQANS S AQTYLNNQEALYQYVS I P KNKLNDNS S SYLQN IQS 
DNQAIGASRQAIQNQI SSLGNAAQVI SSNLNTNNNI IQQSLQVGQALIQTFSQ I VSLIAN 

I 

CPn_1006 1156493 1155990 

CT849 hypothetical protein 

TKVNFFIMSITTLGTLPTVOTINSSRPPLEPl^PKIGAVLFSIYELLLQArEIROQTVL 
TQSQQLNDNTNIQQQLNQETNO IKYAIVSAGAKEDE ITRVQNQNQNYSAQRSNIQDELVT 
TRQNGQI ILSHASTNINI IGXX>SSQDSSFIKTTNSIGSTVNQLNKPLG 

CPn_1007 1156689 1156907 

CT849.1 hypothetical protein 

LWY KS LAG E EKDVSGN ECNDY P EVFKDDVSAYVLVTCGOMSS EGK I QVEMTY EGDP AVI S 
YLLTKARDSLDES 

CPn_1008 1156904 1158223 

CT850 hypothetical protein 

VL NY S F IGM LK PMYVL S K R L YRWVNQ L I KLGDLVKNS R3 F S VEWVF I S ALLL I FGCLGCA 
SWKVS LVP FLLLFS FLAFPL I LC FRGKGYALLLGVPjTLYVAKYWGETLYVS FWLSGL 
GVSFLLAFGLFLQGVWLAQEEEMVKGKEQLRLSEDLDAQRSAYEDLLLTKSQEKEFLDAR 
AQGLDP ELTECQELLKAAYOKQEYLT IDLKI LADQKNSWLEDYAELHNKY I ELVSKNGDV 
VFPWVAEPSVGESQGSERVDVSRWVSALQEKEESLERLRNEILVEKORCSDYEHRCQELG 
LLLQNFTALERRCEELONLLNQKETQINELHQLVCKSEEKVSVEPSAHAETSCVEEKQYK 
^LYSQLOEQFLEKSETLSLVRKKLFAVQEKYLTLKKKEELTKODISFDDtSMIQGLLERI 
EILEEET/SHLEELVSRSLSL 

CPn_1009 1159085 1158186 

map-Methionine Aminopeptidase 

YRLLHPVrLMKRNDPCWCGSGRKWKQCHYPQPPKMSPEALKOHYASQYNILLKTPEQKAK 
T YNACO [TARILDCLCK.*V?OKGVTTNELDELSQELHFKYDAI AAPFHYGSPPFPKTICTS 
UlEV \(AV\ [ FND t PLKCVJD IMN t DV.SC I V DG Y YG DC Z PMVM [GEVPEI KKKTCQAALECL 
MD'; [A I LK PC I K1FA I EARADTYGFSWDQFVGHGVG r EFHENPYVPH YRNRSMI P 

lak;m ; ' t i epm r wokkeiswdpknlweartcdmop'jaowehtia itetgyetltllnd 

1 t'n_|fJt f j 1 ".■)(,/'-, 1 I1'M)|, / 

' T'lS." h/(M>t ti> r u .i L piorcui 

VMM M.fJL .:[,!.■ VVLP[V[ I I VKVALLKNF'IR KKOOP VII. RFC I. FA r/'.AMLFVTFGR 
'A-TOFl.hl-'.LYW Of ^^:!'L[,fTVr;t KMMLAPMPEKAKDEyriKTrPrrFPLArPVtTOPA 

■/iTAi.i/^Mn r,i f 7urr rrp\M[ r AWArru.i'Tr.LC.T.T-TnpLF'iNFGLLALEPLPniAL 
rj,M';vnr,MLK*.; i lai-nu \v\ u\ 

' l-ti_101 1 I ].m! Ifj'i ! l','l'K].> 



TTH'j i hypothec ical proton 

EMR LKNY PM I 0F3FFL PQTC I LLLA3 D3LTN I LALH I JLLANYSVKQ RMLVLLRESFFAF I 
AMFALVGLAUXJLKVLNTPVCA I EWGGI AVTLAGVRAVLRLGKEESW I PYKFNMS PSYS 
PCI "P TALPLMFGPSG 

CPn_1012 1162220 1160421 

yzcB-ABC transporter permease 

A CFfiL ITSKMKKKF [FYFVTVF^LLFLWE^SRHPPT Tr ^FF r PPPfs'TIAo'iTLOSLPLLL 

:MAv:v:-i"i;r:;-'"r,".' r v- r~*- _ : - ■ \?vry -'Lnrr»\Lr 

HIF-SGLKIArGSAGFAAIAGEWVASOSGLGILMLESRRNYEMELAFAGLATLSILTLSLF 
0 ITLL X EKL I FSLFRVKRMSLKHKSVAKKALSVLAL I P IMLI PWKGNSKS PPDKKNLTSL 
TLLLDWTPNPNHIPLyAGVAKGYFKQHGLDWU5KNTDSSSAVPHVX.FEQVDMALYHALG 
I MKTS I KGM P IQ IVGRLI DSSLQGFL YRSQ D P I YKF EDLNGKVLGFCLNNSRDLNR r FT 
LiNRNGWPSEVKhfl/SSDLISPMLLNKIDFLYGAFYNIEGVKLQTL^ 
TGPQLIVFTKKGTKASEPEIVEAFQKALQES I IFSKDHPEDAFKLYAKETKSIPKNLYQE 
YLQWEETFPLLAQSQDPLSKDLVDKLLET I IKRYPELASEVAKFSLNDLYNPSLPEEGSV 

CPn_1013 1162209 1163624 

fumC-Fumarate Hydratase 

RENSI^HRGNIDMRQEKDSI^IVEVPEDKLYGAQTMRSRNFFSWGPEI^PYEVIRALVWI 
KKCAAQANQDLGFLDSKHCEMIVAAADEILEGGFEEHFPLKVWCTTCSGTQSNMNVNEVIA 
NLA I RHHGGVLG SKDP I H PNDHVNKSQSSNDV F PTAMH I AAVI SLKNKL I PALDHM I RVL 
DAKVEEFRKDVKIGRTHLMDAVP>TTD3QErSGYSSQLRKCLESIAFSI^LYEIA:GATA 
VGTG LNVP EG FV E K 1 1 HYL RK ZT D E P F I P ASNYF SAL S C HDAL VDAHG S LAT LACALTK I 
ATDLSFLGSGPRCGLGELFFPENEPG3S IMPGKVNPTCC EALQMVCAQVLGNNQTVI IGG 
S RGNF ELNVMK PV 1 1 YNFLQSVDLLS EGMRAF S EFFVKGLKVNKARLQDN INNS LMLVTA 
LAPVLGYDKCSKAALKAFHES I SLKEACLALGYLS EKEFDRLWPENMVGNH 

CPn_1014 1165456 1163732 

ychM-Sulfate Transporter 
ALASTI/TCIVK\TCWFKNFIPKLYTSIK 

GVGVSP IQGLLAS I IGGLLASAMGGSNVLISGPSSAFIS ILYCLSAKYGAEALFTVTLLA 
GVFL I AFGLTGLGTF IKYMPY PWTG LTTGLA I IIFSSQ I KD FLGLQMG AN I PADF LPKW 
IAYWDHLWTWDSKSFAVGLFTLLIMIYFRNYKPRYPGWIAIVTATTLVWLLKIDIPTIG 
SRYGTLPTAIPLPKIPQLS ITKILQLMPDALT IAVLSGLETLLSAWADGOTGWRHQSNC 
QLVAQGVANIGTSLFSGI PVTGSLSRTAAS IKSEATTPI AGI VHS I FICFILLLLAPLTV 
KIPLTCLAAVLILIAWNMSEIHHFIHLFTAPKKDI^ 

AAFLFMKQMSDLSDVISTAiGfFDKDSDFLSKAEVPQNTEIYEINGPFFFGIADRXiKNLLN 
DiFJCPPKTFTirMTRVPTTnA.qAMHAr.FFFFT.FrnRnr:TTTJj.Anwyrpr.ADLKT?vFT.n 
ELIGVDHIFSNIKSALLFAQALTNLESKTSTRHLV 

CPn_1015 1165550 1166893 

CTl|7 hypothetical protein (possible IM protein) 

KNt$$KN FS FFTSVRVRS KVDHE 1 I LEVTMLKLQLCALFL FGYLAI VFEH IVRVNK S AI AL 

AMQGLMWLVCFSHI PMADHMILVEEIADMSQVI FFLFSAMAIVELIDAHKGFSVTVKFCR 

IQ^RTLLLWALIGLSFFLSAALDNLTS I I I I I S I LKRLVKAREDRLLLGAICVIAVNAGG 

AV^^jLGDVTTTMLWINNKI TSWG I IRAL FVPS LVCVLVAG FCGQ FFLRKRGSTLI AKDVE 

LG^ATPKSLWIIFIGLGSLI>IVPVWKACLGL 

HLsR^PH ILTK IDISSITFFIGI LLAVNALS F ANLLTDF SLWMDK I F SRNWAI VIGLLS S 
VLDNVP L VAATMGMYT L P LD DT LWKL I A YAAGTGG S I L I IG S AAGVAFMG LEKVDFLWY F 
KR&.SWIALASYFGGLFSYFVLESLNFFI 

CPn_1016 1167027 1168898 

hypothet ical protein 
KRES/raKKGKLGAIVFGLLFTSSVAGFSKDLTKDNAYQ 

FGWDLSQQTCXJARLQLVLEEKPTTNYCQKVLSNYVRSUTOYKAGITFYRTESAYIPY^ 
LSEDGHVFVVDVQTSQGDIYLGDEILEVDGMGIREAIESLRFGRGSATDY3AAVRSLTSR 
SAAFGDAVP SG I AMLKLRRPSGL IRST PVRWRYT P EH I GDFS LVAP L I PEHK PQLPTQSC 
VLF^GVNSQSSSSSLFSSYMVPYFWEELRVQNKQRFDSNHHIGSRNGFLPTFGPILWEQ 
DKGPrYRSYI FKAKDSQGNPHRIGFLRISSYVWTDLEGLEEDHKDSPWELFGEI IDHLEKE 
TD AL"l I DQTHNPGGSVFYLYS LLSMLTDH PLDT PKH RM I FTQ DEVS SALHWQDLL EDVFT 
DE^AVAVIGETMEGYCMDMHAVASLQNFSQSVLSSWVSGDINLSKPMPLLGFAQVRPHPK 
HQYTKPLFMLIDEDDFSCGDLAPAI LKDNGRATL IGKPTAGAGGFVFQVTFPNRSG I KGL 
SL^^I^VRKIX3EFIENLGVAPHIDU5FTSRDLOTSRFTDWEAVKTIVLTSLSENAKKS 
■ EEOXSPQSTPEVIRVSYPTTTSAS 

CPn_1017 1168997 1169935 

lytB-Metalloprotease 

VII >?RKLI LCN PRG FC SG WRA I QVVEVALEKWGAP I YVKH E I VHNRHWNALRAKGA I F 
VEELVDVPEGERVIYSAHG I PPSVRAEAKARKLI D I DATCGLVTKVHSAAKLYASKGYK I 
ILIGHKKHVEVIGrVGEVPEHITWEKVADVEALPFSSDTPLFYITQTTLSLDDVQEISS 
ALLKRYPSI ITLP3SS ICYATTNRQKALRSVLSRWYVYWGDVNSSNSNRLREVALRRG 
VP AD L I NN PED I DTN I VNH SG D I AMT AG AST P ED WQAC I RKLS S L I PGLQVEND I FAVE 
DWFQLPKELRCS 

CPn_1018 11693 a 5 117062° 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RMSYFN-/QKNSWLRCLGLLAKFFSRLLYPVFFSFREGIYLFSSLYLKYPRLFFYDLGKY 
VY3LRHC P Y AKLGRLPCASLLKEGNVYGETPWSVLAK ICQAFD ITSQD I LYDLGCGLGKV 
CFWFSH WRCQV IG IDNQPHF I RFSSNMHRKLSSGFALFDTEEFKNWLSQASYVYFYGS 
3FSRRLLNEI IL.KLSEMAPGSWIS ISFPLDSFSRGKECFFTEKSCSVRFPWGKTIAYKN 
[RKG3 

CPn_1019 1172146 1170&3S 

CT860 hypothetical protein 

[ HRPJM I MTVSYQ3 ISTPPPECEFDIFVDGNATEELV/*/AAEVQVALPAGEOYAMLRATSEL 
CFG ILTQJjECALTUALPPKEKPLOEEQFLVKNG ILJ^PSTSLPNLKPGQSCOTSLASHRNP 
LAQC::T:JJJNG'IVlKAi:TCrrSSSFPFF5CK\PECDSC;VDKTFTVGVQTPKA0E0QEA5ASQ 

::oaofhvrsy:;l;stikch^akekvsostk\;actokhtotk:;datl:ipmsly:;tlhkevpo 
alj j : ;tk :;fA>K rji- v.i irdorooec y eq eq ec eegkkkt pwctves wt-: j ^nq'/y eg ytp i 

r PDP EVRFALiJE^.JVL.AGKRVTNLDVl/? E CTELJ4K 1*1 LKGRANDTMTRLEEP ELMER E 
AHEI AA; JY: JRfJAKYARWI A ) [ ATATLG I LGA T APMVGE i:JGD r J l\* IFVCK I JGPTKDATAK 
TFFKGl' ;KV['T:;L:TOr-M:A J \:;KVfinLJC''\VRAV\rVPKnvrRMP(jnEVTRTrCEVKDNW 

k : ; m i ;rj r ! , [ .n r lot !•: 1 1 u a a m i.t o 

iTruLOiil) 1 1 / ii.-i t L 1 ls,» 

CT861 hypothetical protein 

DSEQOELLQSRREER.jETYANQOSSEKK I ETKVC I KDLCKDLF SQDOCSNCrGKKSPFQC 
0T3RKNRIAKAAQAVPVrPPPSIGVFTI^YLLTKCG [LGDFoSYGCHKDSVESTQRELDA 
LHEKR lETIKVSI EKEKRERLWCSLSD I ICWLAPFVG IG IG IVAI L3GGG IFAFAGFFAG 
LI5LVIKCLEKLKFWDWLEKHLPIKNEELRRKI ITI IQWVVYLTPVILSICTLKVENLGF 
S P 1 1 EGA I KG IQPA r ESTMAALRCAZ LFSQAE I YKLKGKLTK I QLD I ELKSFDRDDHYER 
SOELLDNMEGSFEALoRILNYMRELDQVYLHGLRG 

CPn_1021 117427Q 11736^ 

TYQKIFKISEEDLEKVYKEGYHAYLDKD'iAKJITVFRWLVFrNPFVSKFWFSLGAJLHMS 
EQYSQALHAYGVTAVLRDKDPYPHYYAYICYTLTNEHEEAEKALFJWWVRAQHKPLYNEL 
KEEILDIRKHK 

CPn_1022 1175709 1174216 

CT863 hypothetical protein 

FS FFFYALXLQ IMNMPVPSAVPSANITLKEDSSTVSTASG I LKTATGEVLVSCTALEGSS 
ST DAL 1 3LALGC IILATQQELLLOSTNVHQLLFLPPEVVELEIQVVDLLVQLEHAETITS 
EPQETQTQSRSEOTLPCXJSSSKQSALSPRSLKPEISDSKC^QALQTPKDSAVRKHSEAPS 
PETQARASLSQASSS SQRS LP PQ ESA PERTLLEQQKAS S FS PLSQF S AEKQKEALTTSKS 
HELYKERT^DRQCREOHDRKHDQEEDAESKKKKKJCRGLGVEAVAEEPGENLDIAALIFSD 
QMRPPAEETSKKETTFKKKLPS PMSVF SRF I PSKNPLSVGSS I HGPIQT PKVENVFLRFM 
KLMARI LGO AEAEAN ELYMRVKQRTD DVDTLTVL I S K I NN EKKD I DWS ENE EMKALLNRA 
KEIGVTIDKEKYTV/TEEEKRIXKENVQMRKENMEKITQM 
VLKLLKELMDTFIYNLRP 

CPn_1023 1176008 1176331 

No robust homolog present in Genebank/EMBL as of 11/7/98 
GLDFLEIFIMKKWTLSI I FFATYCASELSAVTWAVPLSEAPGK IQVRPWGLQFQEEQ 
G SVP YS FYY PYDYGYYY PETYGYT KNTGQE S RECYT R F EDGT I FY EC D 

CPn_1024 1177317 1176334 

xerD-Integrase/recombinase 

IFFFPWFSI^SUaAPLPILKLHSI^S>rraPSTQFKTT^ 

YRQD I S SFLT I SAI SS PQD ISQNSVY I FAEEL YRJ^EA£TTIjARiUj I ALKVFFLFLKDQQ 

IiPYPPIIEHPKIWKRLPSVLTPQEVDAIJ^VPLQMEKNPRHLAFRI>TAIL^ 

VS ELCDLRLGHVSDDC I RVTGKGSKTRLVPLGS RAREAI DAYLC PFRDQYOKKNPH EDHL 

FLSTRGHKLERSCVWRRIHNYAKOVTSKPVSPHSLRHAFATHLI*DNKADLRVIQE 

RIASTEVYTHVAADSLI EKFLAHHPRNL 

CPn_1025 1177266 1178879 

pgi-Glucose-6-P Isomerase 

GAEQFSSYREKTMERXRFIIXDSTKILQELAIJJPI^LTAPGVLSAERIKXFS 
SFATERIXiDAILAALISIAEERGLHESMIJ^MC^GQVVN^ 

DSSFTGEAEDIAVRSRVEAQRLKDFLTKVRSQFTTIVQIGIGGSE1>GPKALYRALRAYCP 
TDKHVHFI SNI DPDNGAEVLDT I DCAKALVWVSKSGTTI ETAVNEAFFADYFAKKGLS F 
KDHFIAVTCEGSPMDDTGKYLEWHLWESIGGRFSSTSMVGGVVLGFAYGFEVF^ 
ASAMDQ IALQPNARENLPMLSALI S IWNRNFLGYPTEAVI PYSSGLEFFPAHLQQCCMES 
^KSIAQDGRRVGFSTSPVIWGEPGTN^HSFFQCLHQGTDriPVEFIGFEKSQKGEDIS 
FGGTTSSQKLFANMIAQAIALACGSENTNPNXNFDGN^ 

ENK I VFCGLCW3 INS F DQ EGVS LGKALANRVL ELL EG AD AS N F P EAAS LLTLFNIKFR 

CPn_1026 1178961 1179137 

ltUA 

CS FGFGKI CEDRMFF I AVRSRGFLD I HG I LAARKGKQWKSTAGAW IGSRGAVFYS LVS 
CPn_1027 1179172 1180755 

No robust homolog present in Genebank/EMBL as of 11/7/98 
NMPGSVS S PPLS PVI VRERVPS S SG S DL I QPHAVLK I S I L I FALVT ILG IVLWLSS ALG 
AL PS LVLTVSGC I A I AVGL I G LG I LVT RL I LST I RKVDAMGY DAAVKEEQYLSR I RELE S 
ENRE I RCRNRAVEDCCAHLS EENKDLRDP EYLHGMT ERL I AS LEI ENQALVAEN I LLKDW 
NASLSRDFRAYKQKFPLGALEPWKED I AC IMEQNLFLKPECI AMVKSLPLETQRLFLYPK 
GFQSLVNRFAPRSRFFCTPKYEYNSRNENEIX3KVAAVCARIJCKEFFSAVLGACSYEELGG 
I C ERAVALK ETL P L P EAVYDTLVQ EF PNL LT AE SLWKEWC FY S Y PY LRP YL SVDYC KRL F 
VQLFEELCLKLFTTGS PEDQALVRLFSYYRNH I PAVLAS FGLPPPETGGSVFVLLPKQEN 
LLWSQI EVLATRYLKDTFVRNS EWTGSFEMMFSYNEMCKEI SEGRI RFAEDYETRHSEEF 
PPSPLSEEGEGEEFLPPCSEEEVSVLERPDLDVDSMWVWHPPVPKGPL 

CPn_1028 1180995 1181999 

mdhC-Malate Dehydrogenase 

FFLKGVRMAFKEWRVAVTGGKGQ IAYNFLFALAHGDVFGVDRGVDLR I YDVPGTERALS 
GVRMELDDGAYPLLHRLRVTTSLNDAFDG IDAAFL IGAVPRGPGMERGDLLKQNGQ I FSL 
CGAALNTAAKRDAKIFVVGNPVNTNCWIAMKHAPRLHRKNFHAMLRLIDQNRMHSMLAHRA 
EVPLEEVSRWrWGNM SAKQVPDFTQAR I SGK PAAEV IGDRDWLEN I LVHSVQNRGSAVI 
EARGKSSAASASRALAEAARS I FCPKSDEWFSSGVCSDHNPYG IPEDL I FGFPCRMLPSG 
DYE I I PGLPWEPFIRNKIQI SLDEIAQEKASVSSL 

CPn_1029 1181987 1182844 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RVFVISTMLWGVSMROSFDELSQNAFKNIFNKQPFCFIFCSLCCFGFVFALFLKLCSRLA 
P E I S LSTLG LGA F FC A FS V I CAS A 1 1 VQF LLH KESQG ET S KLCCA I KNTWS S LWLS LLVS 
MPFF I AMVAWTVAMLSS FLGSLPWVGKLFHTVL I FI PY LSATAL I LLF LGS FSCLFFC I 
PV LHNQ ESIDYRKLLECFRGWI LRQF IG W I ALVPLALC 3WLA LDS FY LMTH LV E I AD I H 
TWSFLAOMFVLIVPIALILTPAVSFFFNFSFSFYLAKQEEEKALVK 

CPn_1030 1183901 1182843 

predicted D-amino acid dehydrogenase 

FKVHFMRIAVLGAGYAGL3VTWHLLLHSQGTATIDLFDPT PLGEGASGM5SGLLHAFTGK 
KALK PPLADQG I NATHAL ITEA3KALNVP IV I SCO ILRPAI DEDQAOLFTERVEEFPKEV 
r^EKAPCFI<:rP3MVrPPNLGALFIK:JGVTL^IDLYr0GLADACMKU:T0FYDELIEDL 
AD [ EEFYDM I [ VTPGANA3 [[-PELKDMPVNKVK^QLLCIGWPKDLAMLliFJ TNAHKYMVA 
NT(JKN'T f LlAlATFnHNOPELTPDPArAYQEIMPPVL:;LFK]LKOAOVLiir\ -V IMR333K3 
RLPVT JP [REKf^Wt't^X .IXJi'KGLLY! IG rTGDML/'QAVLRK'/rAY I AKCTLKT [ 

' Pit_10 ; 1 II R' t r )f) 7 j iH4()'f^ 

arcD-Arginine/Ornithine Antiporter 

I KrTFMT' )RTh\;:"KNf.( "I 1 LAIAf JMVV.T; I W \(\ I FSLPOf IMAATAt IACAV [ [...WrLTuTG 
MFI* [ANTFR I L.jTIRPDLKEGI YMY^PEGT-f Jt'Y IGFT IGW^JYWtA '0 11 7 INVGYAV [THDA 

i ny i rPE-vroi k INT Li pa i u x ; ; r i, rwvFt n* r vlcg r rqa:; i r nv u ;t ifk r e i l r r f i i l 

'I'AhTFKLAVI-'KTDf^ II iAVrKAOP'JLGT/':: lOI.K^ 7PMLVTLWAF U I [ I* * A V\'M ''J'M AKN P 
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GL t rAVL:J:;WLo^trVAEIPF *AAKNf7rFPEIFTIENKEK3PSV3LYITSSVMQLAMLL 
VYFSSNAWNTMLS ITGVMVLPAYLA3AAF LFKL5KSKTYPKKGS TKAPLAMITG ILGWY 
: i LWL I Y AGG LK Y LFMALVL LALG I PFY I DAGKKKKNAKTFFAKKE I VGMTF IGLLALTAI 
FLFLTGRIKI 

CPn_1032 1186153 1185566 

CT373 hypothetical protein 

f..MAY^TPYPTr,AF!{^^rnErDDCMPPO p F=TF^D^ALLQAKTF^FT;TVPYTS T /LPKEL 

,■:■,"/:■:;■';■" ; ' • "va-mv,; " r._!!"!.r,r " *i "; r n-rr' T, i ' \-r ;f-l" 

ALGFLNFENAEPAKVN 

CPn_1033 1187656 1186187 

CT372 hypothetical protein 

NNKKKDYSGEFLTTDTVDS I AFLPSEENFCY I KT ILFFRVKKKHYAFFYGEFMI SFRFLL 
LSGIXALGISSYAETPKETTGHYHRYKAKIQKKHPESIKESAPSETPHHNSLLSPVTNIF 
C S H PWKDG I SVSNLLT SVEKATNTQ I SLDFS I LPOWFYP HKALGQTQALE I PSWQFYFSP 
STTWLYDSPTAGQGIVDFSYTLIHYWCTNGVDA^AAGTASSM^YSNREN^ 
QTFPGDFLTIAIGQYSLYAIIXSTLYDNDQYSGFISYALSQNASATYSLGSTGAYLQFTPN 
SEIKVQLGFQDSYNIDGTNFSIYNLTKSKYNFYGYASWTPKPSCGDGQYSVLLYSTRKVP 
EQNSQVTGWSLNAAQH IHEKLYLFGR INGATGTALP INRSYVLGLVSENPLNRHSQDLLG 
IGFATNKVNAKA I SNVNKLRRYESVMEAFAT I GFGPYI SLTPDFQLY IHPALRPERRTSQ 
VYGLRANLSL 

CPn_1034 1188589 1187732 

Predicted OMP (CT371) [leader (18) peptide] 

KTSWQKYKKYLSYS ILVQK I ARYVMKTWLFFTFLFSCSS FYASCRYAEVRS IHEVAGDI L 
YDEENFWLILDLDDTLLQGGEALSHSIWKSKAICXjLQKCXJ^ 

GTVQP I ESAIFLLIEK IQKCGKTTFVYTERPKTAKDLTLKQLHMLNVSLEDTAPQPQAPL 
PKNLLYTSG ILFSGDYHKGPGLDLFLEICTPLPAKI IYI DNQKENVLRIGDLCQKYGIAY 
FG ITYKAQELHPP IYFDNIAQVQYNYSKKLLSNEAAALLLRHQMHE 

CPn_1035 1190081 1188570 

aroE-Shikimate 5-Dehydrogenase 

WQLPLMVP IVHLQ IWRFSMIYYGVSVMLCATVSGPSFCEAKQQILKSLHLVDI IELRLD 
L I NELDDQELHTL ITTAQNP I LTFRQHKEMSTALWI QKLYSLAKLE PKWMDIDVSLPKTA 
. LQTIRKSHPKIKLILSYHTDKNEDLDAIYNEMIATPAEIYKIVLSPENSSEALNYIKKAR 
LLPKPSTVLCMGTHGLPSRVLS PLI SNAMNYAAGISAPQVAPGQPKLEELLSYNYSKLSE 
KSH IYGL IGDPVDRS ISHLSHNFLLSKLSLNATYIKFPVT IGEWTFFSAIRDLPFSGLS 
VTMPLKTAI FDHVDALDASAQLCESINTLVFRNQKI LGYNTDGEGVAKLLKQKNI SVNNK 
HXA-IVGAGGAAKAIAATLAMQGANLH I FNRTLSSAAALATCCKGKAYPLGSLENFKTID I 
I I&^PPEVTFFWRFPPIVMDINTKPHPS PYLERAQKHGSLI IHGYEMF IEQALLQFALW 
FP^.LTPESCDSFRNYVKNFMAKV 

CPn_1036 1191180 1189984 

aroB-Dehydroquinate Synthase 
GY&KjPCSCRSCIIPTMUTTMMSCTIITTP 

VQ'lSHLLGP I LDHIKMLGYQVIVLTFPPGEPNKTWETFI SLQYQLVDQNI SPKSSI IGIGG 
. GT|^£HjMTGFLiAATYCRGLPLYL I PTT ITAMVOTSIGGKNGINLRGI KNRLGTFYLPKEVW 
MC^'FLSTLPREEWYHGIAEAIKHGFIADAYLWEFLNSHSKMLFSSSQILHEFIKRNCQI 
KA^VAEDPYDRSLRKIIOTGHSIAHAIETLAKGTVNHGQ^^ 

TP'Ql^IDQLERIJjKRFNLPSTLKDLQSIVPEHLHNSL.YSPENI IYTLGYDKKNLSQHELKM 
I^fUHLGRAAPFNGTYCASPNMEILYDlLWSECHVMRHC 

CPn_1037 1192286 1191123 

aroC-Chorismate Synthase 

LHFSRGSRRSFLEELLRTSVSRSHYLVKVMKNSFGSLFSFTTWGESHGPSIGWIDGCPA 
GLELHESDFVPAMKRRRPGNPGTSSRKENDIVQ I LSGVYKGKTTGTPLSLQI LNTDVDSS 
PYlEKfeERLYRPGHSQYTYEKKFGIVDPNGGGRSSARETACRVAAGWAEKFLANQNIFTL 
AYjLSSLGSLTLPHYLKISPELIHKIHTSPFYSPLPNEKIQEILTSLHDDSDSLGGVISFI 
T S PI HDFLG EPL FGKVHALLAS ALMS I PAAKG F E IG KG FASAQMRGSQYTD P FVMEG EN I 
T LKSj>JNCGGT LGG I T IGVP I EGR I AFKPT S S I KRPC ATVTKTKKETTYRT PQTGRHDPCV 
AIRAVPWEAMINLVLADLVLYQRCSKL 

CPn_1038 1192750 1192199 

aroL-Shikimate Kinase II 

WKL|LR^^TIILCGLPTSGKSSLGKAI^FLNLPFYDLDDLIVSNYSSALYSSSAEIYK 
AYGjR&KFSECEAR I LETLPPEDALI SLGGGTLMYEASYRAIQTRGALVFLSVELPLIYER 
LEKRGLPERLKEAMKTKPLS E I LTER I DRMKE I ADY I FPVDH VDKS SKS SLEQASODL IT 
LLKS 

CPn_1039 1194011 1192665 

aroA-Phosphoshikimate Vinyltransferase 

VCFTMLTYKVSPSSVYGNAFI PSSKSHTLRAILWASVAEGKSI IYNYLDSPDTEAMICAC 
KQMGAS IKKFPQ I LEI VGNPLAI FPK YTL I DAGNSG I VLRFMTALACVFSKE ITVTGSSQ 
LQRRPMAPLLQALPJ^FGASFHFSSDKSVLPFTMSGPLRS AYS DVEGSDSQFASALA VACS 
LAEGPCSFTriEPKERPWFDLSLWWLEKLriLPYSCSDTTYSFPGSSHPOGFSYHVTGDFS 
SAAFIAAAALLSKSLQPIRLRNLDILDICGDKIFFSLMQt^njGASIQYDMEEILVFPSSFS 
COS CDMDCC IDALPILTVLCCFADSPSHLYNARSAKDKESDRILAITEELQKMGACIQPT 
H DG L LVM P S P L YG AVL DS H D DH R I AMALT I \AL Y A SG DS R I HNT ACVR KT F PN FVQTLN I 
MEARIEECHDNYSMWSTHKRKVFARESFG 

CPn_1040 1194876 1194073 

No robust homolog present in Genebank/EMBL as of 11/7/98 
RPriOSLFLRTWCPSSSFREHTVCAAPLLYPRRRSPDYLFSPTGCPMSTTTVKHFIHTASR 
WE PVLK E T VA^NYWf 1 AQW I NT LS FLENSGAKK I SAS EH PTEVKEEVLKH AAE EFRHGHYL 
KTOIGRI:lETSLPDYTSKNLLGGLLTKY\'LKLLDLRTCRVLENEYSLSGQTLKTAAYILV 
TYAI ELRA^ELYPLYHDELKEAQSKITVKb I ILEEQGHLQEMERELKDLPHGEELLGYAC 
OFEGELGLOFVERLCQM IFDPSSTFTKF 

<:i*n_L04 I ! l-> ft 10 l 111 172^ 

•bioA-Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase 

I K-'RtUK f Pi .1 i I KKT: \ l\<t VFTLVMrLDr-IEPJ tl'RK ^HLFNCR^NALTKPOYLRYGCK 
WV : I Rf J r' I < A L AN : : U MO \ : Y N P UJ( X> 3 K P N i \ K DN f I K L LC PTM EG !JM D K Q:J : ; GNCOC IWHPFT 

o::ald:;ti' t k ivw :fgaylyac:;gtr /LnAt jgwwc'nuighghpyitkklceqaqklehv 

I i-'ANFTIIETAl .Kl -V: ;KLAPLLPlX;L£RFrF.jDNG.TT r ; I K IAMK F AVQYYYNQNKAK3HFV 
f ![ ;NAYI I' II V ["R ;AM; * [ AC IT' II "TTVPFUDLt LP'J.'ITTAAPYYGKr FLA [ AQAKTVFSESN I 

aak r yki -[ .[ .oi w v ;mi .mynpd \ lke l lklakhyc ;vi,f * r ade i f .tgf( irtcplfaseftd [ 

f 'I 'I J r [(.'I ;Ki UiT'f ^ ;YJ .l'f .Af .TVTTKF [ (ID \ FV'^OL' ) RMKALLI \< JltTrT^NPLGCaAALAGL 
l'I.Tl.:;i'l-;rUVK'jM I L-'Ri m.iF.r^nAHG';i.VVC t Pf , nVIj ITVLAL.tJYPAL/VrGYFSOYRDHLN 
RFFLERGVTjLRPLGNTLYVLPPYC IQEEDLRI IY3[{L0DALGLCPC 

CPn_1042 1196629 1195914 

•bioD-dethiobiotin synthetase 

NRSPFTYFRANFFMQRI I IVGIDTGVGKTIVSAILARALNAEYWKP IQAGNLENSDSNIV 
HELSGAYCHPEAYRLHKPL3PHKAA0 IDNVS I EESH ICAPKTTSNL 1 1 ETSCGFLSPCTS 
KRLQG DVFS SWS C SW I LVSQAYLGS I NHTC LTVEAM RS RNLN I LGMWNGY P EDEE HWLT 
Q/ETKLPT IGTLAKEKEITKTI rSCYAECWKEVWTGNHOG I QGVSGT P S LNLH 



bioF_2-Oxononanoate Synthase_2 

PMLCC^FLIEALARRKSK^m , RSLSLNSHLIDFT3NDYI J GFASSPEl.RKEYITKLHArES 
LGATGSRLLTGH SQ LCQR I EEQLAAYHNF ESCL I FNTGYTANLGLLYALATDQDR I LHDL 
YI HAS I YDG IRLSKAQSFPFNHNDLNHLEKRLASSHLGRTFVCVESVYS LHGSVAPLQAI 
SEI/TERYSAYLIVDEAHAVGVFGD^EGLVSAI^L£DK^ 

SIUCDYLINFCRPFIYTTAQPPHALTAIEI^YF^QRAF^QREHI^ALIHHFREKAQN^ 
LQLMKCNTTTPIQSICVSGSHRARQAALQIQNSGYDVRPIVS PTVKOREELLRICLHAFN 
TKNEIDHLLHTLEQ I FLCNVSSL 

CPn_1044 1198700 1197699 

•bioB-Biotin Synthase 

AKHMREETVSWSLED I RE I YHT PVFEL I HKANA I LRSNF LH S ELQTCYL I S I KTGGCVED 
CAYCAQSSRYHTHVTPEPMMKIVDVVERAKRAVEL^ATRVCLGAAWRNAKDDRYFDRVLA 
MVKS ITDLGAEVCCALGMLSEEQAKKLYDAGLYAYNHNLDSS PEFYETI ITT RSYEDRLN 
TLDWNKSG I STCCGG IVGMGESEEDRI KLLH^/LATRDH I PESVPVNLLWP I DGTPLQDQ 
PPISFWEVLRTIATARWFPRSMVRLAAGRAFLTVEQC/TLCEl^GANSIFYGDKLLT^ 
ND I DEDAEM I KLLGL I PRPS FG I ERGNPCYANNS 

CPn_1045 1199602 1198901 

•conserved hypothetical bacterial membrane protein 

GTLPMNTSHRKTLVFSYI^STFTIXLVLSNLVLSSKLIPTTFFNFIIPGGLILYPLTFLI 

SDVVNEIFGPKKARVMIFSAFIANLLASSIVQI 

AS LLAF IVSQQLD IVLYTFFKNRT PNS SLWLRSNGSTW I SQ I PDTF IVDTCILYFGMGLS 
FPC/TLNIMFYSYIYKITFCVLTTPLFYIAVOTIRKFLGMPSTKIAOT^ 

CPn_1046 1200675 1199590 

•Tryptophan Hydroxylase 

VHYCERTLDPKY ILK IALKLRQSLSLFFQNSQSLQRAYSTPYSYYRI ILQKENKEKQALA 
RHKC I S I I^FFKNLLFVHLLSLSKNQREGCSTI>IAVVST PFFNRNLWYRLLS SRF SLWK S 
YCPRFFLDYLEAFGLLSDFLDHQAVIKFFELETHFSYYPVSGFVAPHQYLSLLQDRYFPI 
ASVMRTLDKCNF SLTPDL I HDLLGHVPWL LHPSFSEFFI NMG RL FT KVI EKVQAL PS KKQ 
RIGTD2SNLIAIVRCFWFTVESGLIENHEGFJ<AYGA 

DQ I IRLPFNTSTPQETLFS IRHFDELVELTSKLEWMLDQGLLES I PLYNQEKYLSGFEVL 
CQ 

CPn_1047 1200537 1201343 

dapB-Dihydrodipicolinate Reductase 

FGSRNMGSSMHVGVIGCSGRTGKVIVSALEQSSEYTLGPGFSRSSALTLFQVIAHNDVLV 
DFSHPLLTKEWAHLL I S PKPL I IGTTGF PGKCKEAHDSLEELTH I VPVWC PNASLGAY 
I HKRLVMLLSQLCNPQFD I R I RETHH RYKKDS LSGTAQDLIJTriCXJVKQEXWGEEYEVGQ 
RDS S KKTI EVQS S RVGDI PGEH EVAF I SSG EQ I LVRHTVF SRNVFGRG I LS I LDWLKTLK 
PQPGLYSI^DTLELVLRNEHCLLKKTTDH 

CPn_1048 1201588 1202604 

asd-Aspartate Dehydrogenase 
DGERKGMRIAVLGVTGLVGQKFVAIiHKWYRDW 

PMPEMVRDLPIRKI EEVQSD I WSFL PSSAESMEAYCLSQGKWFSNASTYRMHSSVP 1 1 
I PEVNSDHFQLLEEQPYPGK I ITSPNCCVSG ITLALAPLRKFSLDHVH IVTLQSASGAGY 
PGVPSLDLLANTVPH IVGEEEK ILRETVK ILGSSKQPLPCKLSVTVHRVPVAYGHTLSLH 
VTFSKDVDLDEI LYSYQEKNKEFPNTYQLYDNPWS PQARKHLSHDDMRVHLGP ITYGGDF 
RT I KMNVL IHNLVRGAAGTLLASMENYFFDYLKREMCLR 

CPn_1049 1202586 1203914 

lysC-Aspartokinase III 

EG>A/SK IVYKFGGTSLATAEN I CLVC DI I C KDK PSFVW S AI AGVTDLL VDFC SS S LRER 
EEVLRJCIEGKHEEIVKM^IPFPVSTWTSRLLPYLQHLEISDLDFARILSLGEDISASLV 
RA VCSTRGWDLG FLEARSV I LT DDSYRRAS PN LDLMKAHWHQLELNQ PS Y 1 1 QGF IGSNG 
LG ETVLLGRGGSDYSATLI AELARATEVRIYTDVNG IYTMDPKVI SDAQRI PELSFEEMQ 
NLAS FGAKVLYP PMLF PCMRAG I P I FVTSTFDPEKGGTWVYA VDKSVSY E PR IKALSLSQ 
YQ S FCS VDYTVLGCGGLEE I LG I LES HG I DP ELM I AQNNWG FVMDDD 1 1 SQ EAQEHLVD 
VLSLSSVTRLKHSVALITMIGDNLSSPKVVSTITEKLRGFCGPVFCFCQSSMALSFWAS 
ELAEG I IEELHNDYVKQKAIVAT 

CPn_1050 1203834 1204798 

dapA-Dihydrodipicolinate Synthase 

LCKTKSYSRHVGRrMHLLTATVTPFFPr^GTIDFASLERLLSFQDAVGNGVVLLGSTGEGL 
SLTKKEKQALICFACDLQLKVPLFVGTSGTLLEEVLDWIHFCNDLPISGFLMTTPIYTKP 
KLCGQILWFEAVLNAAKHPAILYNIPSRAATPLYLDTVKALAHHPQFLGIKDSGGSVEEF 
QSYKS I APH IOLYCGDDVFWSEMAACGAHGL ISVLSNAWPEEAREYVLNPQEQDYRSLWM 
ETCRWVYTTTNPIGIKAILAYKKAITHAOLRLPLSIEDFDLENVSPAVESMLAWPKLRTS 
VFSYS 

CPn_1051 1204956 1205270 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FFMTPKSIOQLHLIKTIDPVRKISPVTTKKSSrFRQSLLRFLELFWMFLYC URSIRFHCV 
HIATFICRGLILFLTTLFLSMICILHFITLPWICKEDPRI IRKNK 

CPn_1052 1205402 120blb f ) 

No robust homolog present in Genebank/EMBL as of 11/7/98 
FFIQKMKYNSREKIK^ALR ICSC/fC ITVFRNrjr. f ;L^CYDKI F ^IJLJCYVFNtiPNS [CRCR 
( ;prFFRGKKTEVCTKFVKrKDErPPlTLCGNDr'VKVAESFPKRHAALE:>Li;SO::^ IGNLCA 
r';NFLD*,OMLi'.RNr::KK[WG'rriFTR 3K'JT('DAEG:JEPFRYTAC( ^YUV ILRl^KLAGliYEL 

GVTAGLLC<:RLKDV:;D:;nRTRvr';;;rL::viiG-;MvrRpL:;r'TKY ivgkar pllfpfrltsd 

VRPOLKKKTRLi JTKD 

' !ti_L(J'- ' 1 Ji'r I -M I .'()r, /o 1 

No robust homolog present in Genebank/EMBL as of 11/7/98 

KK J.'HJLrJt-AK LlJllNHLYLTc UJIjI/IVAf "I ■ 1 L. .TI X l.PNY.'^E IKA.^HCVLVY* *K FRf ' t r JGEP 

:i'Lat';gndtyy:: [V:M.i'rt^,i< /rvT:;i j ';GRMi>FNi dmhvapk n tavl::ih rrRF^KcrPG 

^jrOYArFSLTAREilLMI ILKLAMTr-ijV'IE V I tjNCT'^ 'TKVTKTNLKEyYRHr, IMNTGF 

el:;vk.';af 






<;Pn_IOVl L2070KJ 120146* 

No robust homolog present in Genebank/EMBL as of 11/7/98 

SR WI fHRF [ MQVLLSPQLPP P PQHSVGS I SSPSKLRVLAITFLVFCMLLLI 3GALFLTLG I 

PGLSAA ISFGLG IGLSALGGVLMI SCLLCLLVKRE I PTVRPEEI P EGVS LAPS EEPALQA 

AOKTI^OLPKELDOI^DIOEVFACLRKLKDSKYESRSFLNDAKKELRVFDFWEDTLSE 

I F ELRQ I VAQECWDLNFL INGGRSLMMTAES ESLDLFHVSKRLCYL PSGDVRGEGLKKSA 

KRrVAPr.M.^r.HCErHKVAVAFDRNnYAMAEKAFAKALGALEESVYRSLTOSYRDKFLESE 

- - v \\\v\-y.\ .-LbAr ■ .< ■■\::Ki / LiM J ANi.'<wK!- : nr.w ' witl- r , c :j :;jnllgdwct 
' i-:i'Mij:: r rH'i :,*■; KT'T-r.^'U'w '-\i,/ *''~"~ f > - P'TKNro' /exanappli-' 

YV RDWY DQ EFQKAG ERLEKLHALY P EVSVS I R ENK I Q ET R3NLEKAY EA I EENYRCCVRE 
OEDYWKEEEKREAEFRERGNKI LS PEELESSLEQFDHGLKNFSEKLMELEGH ILKLQKEA 
TAET\/ENKILSDAESRLEIVFEDVKEMPCRIEEIEKTLRMAELPLLPTKKAFEKACSOYNS 
CAEMLEKVKPYCKESLAYVTSKERLVSLDEDLRRAYTECQKRFQGDSGLESEVRACREQL 
RER IQEFETQGLDLVEKELLCVSSRLRNTECDCVSGVKKEAP PGKKFYAQYYDEIYRVRV 
QSRWMTMS ERLR EGVQACNKMLKAGL S EEDKVLKEEEYWLYR EERKNKEKRLVGTK IVAT 
QQRVAAFES I EVPEI PEAPEEKPSLLDKARSLFTREDHT 

CPn_1055 1209583 1210521 

No robust homolog present in Genebank/EMBL as of 11/7/98 
CKYLYHHSYPPPPDHSVGAFFCLSKFRVLAITFLVLGVLFLISGALFLTLGISGLSAAIS 
rGL^rGLSAI^VLWSGLLCLIAKREVPTVRPEEIPEGV^APSEEPAUJATQKTLAQL 
PKE^TOLDRYIQEWSCLGKLKDLRCEDQGLIJKDAKEKLQVFDFVWK^>^K^EFVEI^IM 
DQ EGWYLKC L 1 0 EMRD IG STLFMSQV S LF KLWEWLGYLPSGDVRG ERIJOCS AREVVDRFM 
R R I C DT RKVAMTFDRNAYGVAKTAFEKAFGAL ETCVYKSMT ESYREAFC EYKKTK I LRDE 
EK I LR I CYLELRR 

CPn_1056 1210482 1211228 

No robust homolog present in Genebank/EMBL as of 11/7/98 
GEDIKDMLSRV^EIEMMLRVIELPI^PIKQAL 

TSEERLQSL DQTL ERAYKEYQ KRFQ EPSRLESEVSGC REHLR EQVKQ FETQGLDL I KEEL 
I FVS DVL F R KMVSC LVSTVHVP FME FYY EY F EL HPXRLRAGWMAWAE I Y SKVRKAF PEML 
KETLEKAKAPREEETYWLI^EERKSKEKRLII^KIEAAC^RVKDLEPPPIKETGKOK^ 
YSFFIRLKS 

CPn_1057 1211467 1213596 

CT356 hypothetical protein 

1 1 H FYF FNFAMP EPLYTNKL IT EKS PYLLLYAHTPVNWYPW3AEAFH I AAI ENKPVFLS I 
GCKHSRWCQVMLQESYTNP EIAAMLNEYFVNVKVDKE EL PYVAKLYGDLAQMLAVSGDHQ 
ETVSWPLNVFLTPDLVPFFSVNYLGNEGKLGLPSFPQII DKLHFMWEDAEEREALVEQAM 
KVLEIASFLEGCVRKEIUDESSIJaVIVAALYQDIDPHYGGVKA^ 

L^YQES RGLF FVDRS L SMVALGGVRDH IGGGVYSYT IDDKWL I PAF EKRLI DNALMALNY 
LE7WACLGKEEYRGIGKQILSY ILSELYS PEVGAFYSSEQAENWGAGGQNFYTOSVEEIS 
NA3X3EDAEIFCDYYGISREGFFNGRNILHI PVHREIEELSEKYHRS IEAIEDIVDRSRDI 
LKGllRAQP^HRSKDDLSLTFI^GV^IYTFAYAGPJ^EVEYIEIGKKCGEFVRNSLYKHH 
ELi^RRWREG EAKYRAS L ED YGAL I LG VLALY E SGCG S FWLS F AEELMQ EWLS FRS EEGG 
FYdSVDGRDSTLL IKQSPLSDGET I SGNALICQCLLSLHL ITEKKHYLTYAEDILQIAQAC 
AK!iMcFSSIXjLLIA5QOTFSPJCHVKVLIA1XjDQEDRSPVIJCCLSGL 
QE#L£TVLPEYEHCLI PKGDCTATT I YVLEVDQCKRFKDLELFRRYLISL 



CPn_1058 1213742 1214836 

CT!3"5T5 hypothetical protein 

EV^RLYQTLRG I VLVS TGC I FLGMHGGYAAEVPVT S SGYENLLESKEQD PSGLAIHDR IL 
FK^^ENWTALDVIHKIJSILLFYNSYPHL ^ 

AKA&fcl AT D PTAVNQE I EEMFGRDLS PLY AHF EMS PNDI FNVI DRTLTAQRVMGMMVRS K 
VMLKVT PG K I RE YYRKLEEEASRKV I WKYRVLT I KANT ES LASQ I ADKVRARLNEAKTWD 
KDRLTALVI SQGGOLVCSEEFSRENSELSOSHKQELDL IGYPKELCGLPKAHKSGYKLYM 
LLDKTSGSIEPLDVMESKIKQHLFALEAESVEKQYKDRLRKRYGYDASMIAKLLSEEAPP 
LFSE3L 

CPn_1059 1214843 1215678 

ksgA-Dimethyladenosine Transferase 

vtrsbpaqlsrflseiqnkpkkslsqnflvdqnivkkivatsevt pqdwvleigpgfgal 
teem aagaqvi ai ekdpmfapsleelpi rle i idacky pldqlqeyktlgkgrwanlp 
y h 1 tt plltklfleapdfwktvtvmvqdevarr i vaq pgg rdyg slt i f lqffadi hyaf 
kvsfiscfypkpqvqsavi hmkvketlplsdeei pvfftltrtafqqrrkvlantlkglyp 
keq^qalkeu;lllnvrpevlslndylalfhkmqag 

CPn_1060 1217694 1215727 

dxs/tkt-Transketolase 

ykrflyihitkvmtssscplldlilspadlkklsisqlpglaeeiryriisvlsqtgghl 
ssnlgivelt ialhyvfss pkdkf i fdvghqtyphklltgrnnegfdh irndnglsgftn 
ptesdhdlffsg h agt als lalgmaqtt p les rthv i p i lgdaafs cg ltl ealnn i st d 
lskfwi lndnnms i sknvgamsr i fsrwlhh patnkltkqvekwlak i prygdslakhs 
rrlsqcvknlfc ptplfeqfglayvg pidghnvkkl i p 1 lqs vrnlpfp ilvhvcttkgk 
gldqaqnnpakyhgvranfnkresakhlpaikpkpsf pdifgqtlcelgevssrlhwtp 
amsigsrlegfkqkfperffdvgiaeghavtfsagi akagnpvicsiystflhraldnvf 

HDVCMQDLPVTFAIDRAGLAYGDGRSHHGIYDMSFLRAMPQMIICQPRSQWFQQLLYSS 
LHWSSPSAIRYPNIPAPHGDPLTGDPNFLRSPGNAETLSQGEDVLI IALGTLCFTALSIK 
HQLLAYG I S ATWDP I F IKPFDNDLFSLLLMSHSKVIT I EEHS I RGGLASEFNNFVATFN 
FKVDILNFAIPDTFLSHGSKEALTKSIGLDESSMTNRILTHFNFRSKKQTVGDVRV 

CPn_1061 1217932 1217666 

CT330 hypothetical protein 

FGSLMVEIHHKDP3LKKLFALQQSLETLNSLSDIVATYEAMFSLIYEGLNKALRKDQLCY 
LLSVNSKGELLK3PSGDPIVQTFPIHPHH 

CPn_1062 1219835 121315^ 

xseA-Exodeoxyribonuclease VII 

rgfpvm:;gppqavaslter e kt llecn fcq i 1 vkg elonvslq pfjg h l y fg i kdsqafln 
<:affhfk:;kyydrkpkdodavi ihgklavyaprgq (Qivai ialvyagegdllqkfeetkr 

Kf.TAEGYFATEKKKPLPFAPQC IGVTTSPTGAVEQDILRVLGRRARNYK I LVYPVTVQGN 

: ;aai i e r r;KA r evmnaenladvli t arggg^ : edlwafnee r lvka t hast r p ivsavghe 

■r[)YTU:ijFA::DVi^PTP':AAACrVCKJ.OEEQVQVFEGYLRHLL^Ii:jROLLTGKKQt!LLPW 
Km-UDRAEFYTTAQOOLDS IETAIQKCVQCK IHE3KQRYDN r'JRWLQGDLVSPMTCRLQS 
ia<KMLJrjAL:'nKAL::LOVRCl(QLKKr;LTYPROIOOA r ;QKL:;i>WRQQLDTLI^PRLHYQKE 
r-:YiniK[)TRI,KHAHWr,E(.\jLR^ilVOKLCLU;prvX';prrELNLONOK[A\/WKCTLATIL 
l-:HKYEN.';VARY:;ALKEOLH.;LNPKNVLKRc;YAMLrDnJEN:JAMi:JVDSLOEHARVRI'jLQ 
IXIEAILTVTN EE [<7KL [KG 



tpiS-Triosephosphate Isomerase 

FCRE3MRIKFRENKERKMTRQGYVLGNWKMHKTIOEAKE\*\'C FLA J LLQGEPL7C7 IGt A 
3 P FTG LRA I H EV I NTTGAF LWLGAQNVH P ELSGAFTG E I JL P M LK CVGVE FVL7GH S ER R 
H I FGESDAF I ASKVKSVAQACLVPVLCVGESLEVREECKAHQV I KKQLLLGLECMDNGSE 
FL I AYEPVWAICTGKVAEASDVQD I HMFCREWAER FS EATAEE 1 3 I LYGC5VKVDNAQR 
FGCCSDVDGLLVGGASLECQSFFEVAKNFNV 



CPn_1064 



122071^ 12208^5 



CPn_1065 1221140 1220928 

No robust homolog present in Genebank/EMBL as of 11/7/98 

RHRLGRHRRTSDPCFLFYF3IPEESLPPDSCRLNQMPKHEHLPSILLKKPIIDYLKITSI 

YEKAIFNTGLP 

CPn_1066 1221132 1221488 

No robust homolog present in Genebank/EMBL as of 11/7/98 
SMSLNKEIGMTVLFYAFLF IFLFLCVILCGLI LVQESKSMGLGSSFGVDSGDSVFGVSTP 
DILKKVTSWCAVAFCIGCLIXSFSTNLLGKKLDA^ 

CPn_1067 1221675 1222292 

def- Polypeptide Deformylase 

IQVLWRDFFTELCQAHVQTMI RRLEYYGS P I LRKKSS P I AE ITDEI RNLVSDMCDTMEA 
HRGVGLAAPQVGKNVSLFVMCVDRETEDGELI FSES PRVF INPVLS DPS ET P I IGKEGCL 
S IPGLRGEVFRPQKITVTAMDLNGKI FTEHLEGFTARI IMHETDHLNGVLY IDLMEEPKD 
PKKFKASLEKIKRRYNTHLSKEELVS 

CPn_1068 1223267 1222365 

rnhB-Ribonuclease HII 

MSCMPPPFVVTLTTSA0NNLRDOIJ«CEKNFIFSQPQNTVFQARSNTVTCT 
KGSEEFIEFFLEPEILHTFTHARVEQDI^PRLGVDESGKGDFFGPLCIAAVYASNAEILK 
KLYENKVQDSKNLKDTKIASLARI IRSLCVCDVI I LYPEKYNELYG KFONLNTLLAWAHA 
WINMJVPKPAGDVTAISDOFAASEYTIXKALQKKETDITLIOKPFIAEODVV^ 
RDAFVQS IQKLEEQYOVQL PKGAGFNVKAAGREI AKQRGKELLAK I SKTHFKTFDE ICSG 
K 

CPn_1069 1223507 1223941 

yfgA-HTH Transcriptional Regulator 

VIMQEHrHKELLHLGEIFRSSRESQSLSLKDVEAATS IRYSCLEAIEQGCLGKLI SPVYA 
QGFIKKYATYLGLDGDSILQEHPYVMKIFKEFSDHNMEKrT.nLESMGGRNSPERAIHSWS 
NLWWAGLI I IGG IMVWWLGSLF S I F 

CPn_1070 1225523 1224144 

No robust homolog present in Genebank/EMBL as of 11/7/98 

RR SLMTFPCGNCNCYYRET P PPNPGGEDI PLQ EGGQ SGSQGGRVITQQPGTGGREMG I SL 

GSDhJVtXSMVEQAGSLLNNLLDSAPJ^ORIGHYCYRTGTPWCREHC 

ETVDD PDNP S AQ F UQQL I Q Q YG P I CVGMS FQQLPHCTQ K I EQGEP LG DG DKQ EVENGC KL 
HRELLJ<AAQPRC>K3ESLVKIJ^N^LGEDMCX2TPPWSLII^ 

WILQPEQQPCPPPPT DEEQLQG AVGG AP A PQQKKH P AQ EC RVTC KLNFRTLLQKLS RLEV 
LSLESGYXGPLGQAAKQIVDLIKKSLKRLVASDLATFLGPGIGLSLESQVPEVLVLLCLL 
SKGYLPLDPLHPEC/IVI^PRVCX?PWQRILRKVLVTTTAGENITO 
WD DDE I ERDG IVTGGG FG I PCQC LRCWRKLPTEKRPNRWL 

CPn_1071 1227336 1225885 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KGTTMVC PNNSWFRMC GNF NC EWVEVTTT EETT RQS AS D I S E EAG S SGG AA P ITTQ PTK I 
TKVEKRVQ FNTAQGDEST I HM I QEAG ELVDS I LSHRRTQGCT EYC Y DS Y ATGCGQRCGS F 
GRL ICGTYKACCLDRE DNQVAGLVHECEQTHG P I AVALAAKTMGLNLMELVEKNT I LSEE 
QKNEFRQHCSEAKTQLYGTMQSLSQNFFLEGVNSIRERGLDDSLVQAVLSFIATRSWEKT 
IESEEASGTSSASNSTRI PACY I LNTSPLTTSRLSCGSRDARRPSSVGAEPQ YVAKKYND 
NGMARQE^KIQVTNLKTGDFSALGPFGLLIVKMI^SFXLSASQSTSSILKHTGGEICYTC 
PN FRD I WLLMLA IGYC PANTDETSWD I HM I DDP IMT I FYRLQY S YRTGKTS AS FLKKK 
PSLVRQESLDCPTPAESVPLMSSLEEEDENEDDDEDGNLAYQQRILECSGHLQTLFLGIK 
INKE 

CPn_1072 1227924 1228835 

No robust homolog present in Genebank/EMBL as of 11/7/98 
KKDYILHANVTCCWKQMLXIQKKRMCVSWrTVGAIVGFFNSADAAPKKKKIPIQILYSFT 
KVS SYLKNEDAST I FCVDVDRGLLQH RYLGS PGWQETRR RQ L F KS L ENQSYGNERLGEET 
LAIDIFRNKECLESEIPEQMEAILANSSALVLGISSFGITGIPATLHSLLRQNLSFQKRS 
I ASESFLLKIDSAPSDASVFYKGVLFRGETAI VDALSQLFAQLDLS PKK 1 1 FLGEDPEW 
QAVGSAC IGWGMNFLGLVYYPAQESLFS YVHPYSTATELQEAQGLQVI SDEVAQLTLNAL 
PKMN 

CPn_1073 1229011 122^832 

Predicted OMP (CT371) 

MRRYLFMVLALC LYRAAPLEAWI K ITDAQAVLKFAREKTLVC FN I EDTWF PK'QMVGQS 
AWLYNRELDLKTTLSEEQAREQAFLEWMG ISFLVDYELVGANLPJT/LTGLSLKRSWVLG I 
SQR PVH L I KNT L R I LR S FN r D FT SCP A I C EDCWLS H PT K DTT F DQ AMA I EKN I LFVGS LK 
NCQPMDAALEVLLSGI SSPPSQ I IYVDQDAERLRS IGAFCKKANIYFICMLYTPAKQRVE 
GYNPKLTAIQWSQIRKNLSDEYYESLLSYVKSK 
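
Each entry in the table above opens with a header line giving an ORF identifier followed by two genome coordinates. The following is a minimal sketch of how such header lines might be read programmatically. It assumes, as a common convention that this listing does not itself state, that a start coordinate greater than the end coordinate marks an ORF on the complementary strand. The names `OrfHeader`, `HEADER_RE` and `parse_header` are hypothetical and are introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class OrfHeader(NamedTuple):
    """One header line of the ORF table: identifier plus two coordinates."""
    orf_id: str
    start: int
    end: int

    @property
    def strand(self) -> str:
        # Assumption: start > end means the ORF lies on the complementary strand.
        return "+" if self.start <= self.end else "-"

    @property
    def length_nt(self) -> int:
        # Inclusive span between the two listed coordinates.
        return abs(self.end - self.start) + 1


# Identifier of the form CPn_NNNN followed by two integers.
HEADER_RE = re.compile(r"^(CPn_\d{4})\s+(\d+)\s+(\d+)\s*$")


def parse_header(line: str) -> Optional[OrfHeader]:
    """Return the parsed header, or None if the line is not a clean header."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    return OrfHeader(match.group(1), int(match.group(2)), int(match.group(3)))


if __name__ == "__main__":
    for text in ("CPn_1055 1209583 1210521", "CPn_1061 1217932 1217666"):
        header = parse_header(text)
        print(header.orf_id, header.strand, header.length_nt)
```

A header line whose identifier or coordinates are not clean digits is returned as `None` rather than guessed at, so damaged entries can be flagged for manual review.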



RNA SECTION 



rmRNA I l'M-> i I 1^074 

ruljfinih I.-, i t- RN \ t,() / i!j *>0/'/4'> 

E ,J - ■ l-NA [OOO'^.A 11)021 I ' 

/ r: r I'NA 1 0(1. M IS 1 ()()',.' / h 

r UNA I 00'. •/) i \ <][)' ,',()'* 



12r»'H)0 122071 
tRNAs 

tRNA #  Begin    End      Type  Codon 
1       89657    89723    Thr   CGT 
2       90998    91070    Trp   CCA 
3       [illegible] 
4       [illegible] 
5       296075   296147   Val   TAC 
6       296151   296224   Asp   GTC 
7       409848   409922   Pro   TGG 
8       462141   462214   Arg   CCT 
9       672236   672318   Leu   CAA 
10      677264   677337   Arg   TCG 
11      739403   739486   Leu   CAG 
12      781610   781680   Gly   TCC 
13      784822   784896   Glu   TTC 
14      784922   [illegible] 
15      836119   836191   Ala   GGC 
16      843926   843999   Pro   [illegible] 
17      877400   877473   Arg   AC[illegible] 
18      1085605  [illegible]  [illegible]  TTG 
19      1142034  [illegible]  Ser   [illegible] 
20      1175863  [illegible]  Leu   TAG 
21      1230028  [illegible]  Ser   CGA 
22      1137462  1137389  Val   GAC 
23      1030603  1030533  Cys   GCA 
24      1000022  999949   His   GTG 
25      961607   [illegible]  [illegible]  GCC 
26      807413   807341   Arg   TCT 
27      786780   786708   Thr   CGT 
28      715971   715889   Leu   TAA 
29      709441   708354   Ser   GCT 
30      680259   680178   Leu   GAG 
31      631445   631373   Phe   GAA 
32      626987   626901   Ser   GGA 
33      293477   293405   Thr   TGT 
34      293399   293317   Tyr   GTA 
35      269142   269070   Ala   TGC 
36      269065   268992   Ile   GAT 
37      164389   164318   Asn   GTT 
38      87522    87450    Met   CAT 
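
The tRNA table lists a begin coordinate, an end coordinate, an amino acid type and a triplet for each gene. The sketch below shows one way such rows might be checked, under two assumptions that are not stated in the table itself: that a begin coordinate greater than the end coordinate indicates a gene on the complementary strand, and that the triplets may be read as anticodons (the reverse complement of CAT is ATG, the methionine codon). The names `TrnaRow`, `strand`, `length` and `reverse_complement` are hypothetical, and only three rows of the table are used.

```python
from typing import NamedTuple


class TrnaRow(NamedTuple):
    """One row of the tRNA table."""
    number: int
    begin: int
    end: int
    amino_acid: str
    triplet: str


def strand(row: TrnaRow) -> str:
    # Assumption: begin > end means the gene lies on the complementary strand.
    return "+" if row.begin <= row.end else "-"


def length(row: TrnaRow) -> int:
    # Inclusive span between the two listed coordinates.
    return abs(row.end - row.begin) + 1


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Three rows copied from the table (tRNA # 1, 7 and 38).
ROWS = [
    TrnaRow(1, 89657, 89723, "Thr", "CGT"),
    TrnaRow(7, 409848, 409922, "Pro", "TGG"),
    TrnaRow(38, 87522, 87450, "Met", "CAT"),
]

if __name__ == "__main__":
    for row in ROWS:
        print(row.number, row.amino_acid, strand(row), length(row),
              reverse_complement(row.triplet))
```

A length far outside the range of the other rows would suggest a coordinate that was misread in the listing.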



What is Claimed is: 

1. An isolated nucleic acid encoding a C. pneumoniae protein as set 
forth in Table 3. 

2. The isolated nucleic acid of Claim 1, wherein said nucleic acid has 
a nucleotide sequence of an open reading frame in SEQ ID NO: 1. 

3. A probe comprising a hybridizing fragment of an isolated nucleic 
acid according to Claim 2. 

5. An isolated nucleic acid that hybridizes under stringent conditions 
to the nucleic acid sequence of Claim 2. 

6. An expression cassette comprising a transcriptional initiation 
region functional in an expression host, a nucleic acid having a sequence of the isolated 
nucleic acid according to Claim 1 under the transcriptional regulation of said 
transcriptional initiation region, and a transcriptional termination region functional in said 
expression host. 

7. A cell comprising an expression cassette according to Claim 6 as 
part of an extrachromosomal element or integrated into the genome of a host cell as a 
result of introduction of said expression cassette into said host cell, and the cellular 
progeny of said host cell. 

8. A method for producing a C. pneumoniae protein, said method 
comprising: 

growing a cell according to Claim 7, whereby said C. pneumoniae protein 
is expressed; and 

isolating said C. pneumoniae protein free of other proteins. 

9. A purified polypeptide composition comprising at least 50 weight 
% of the protein present as a C. pneumoniae protein comprising an amino acid sequence 
of Claim 1. 

10. A monoclonal antibody binding specifically to the polypeptide of 
Claim 9. 

ABSTRACT OF THE DISCLOSURE 



CHLAMYDIA PNEUMONIAE GENOME SEQUENCE 

The C. pneumoniae genome sequence and analysis of the encoded polypeptides 
and RNAs are provided. The C. pneumoniae gene nucleic acid compositions find use in 
identifying homologous or related proteins and the DNA sequences encoding such proteins; 
in producing compositions that modulate the expression or function of the protein; and in 
studying associated physiological pathways. In addition, modulation of the gene activity 
in vivo is used for prophylactic and therapeutic purposes, such as identification of cell type 
based on expression, and the like. 



Contig463 Length: 273254 










1 


ATTGTTCCTG 


■ TAAGAACACT 


TCCAAAGCGC 


ATTTAATPAT 

■c\ x x x nn j. v^rt l 


TTTTA PTA Zi A 
1111 Jr\\3 1 r^i-iri 


51 


AAATAAAAAT 


ATACTTTTAA 


ATGTTGAGAA 


AATTTTTAGG 


TA A APTTTAT 


101 


AAAGGGTTGT 


TGGTGAAACC 


TTTGGGTTAC 


TCCTGAGAAr 


GAPTTTPTP A 

OrAV^ x 1 1 o 1 \3s\ 


151 


TTCTATAGTA 


TTAAAAGGAT 


CTTGGAGTAT 


AACAAGTAAA 


GATPTTTGAG 


201 


GATAGCGTAG 


GGCCGTATTT 


TGAATAGCGT 


CCAATAAAGG 


GPGTTTGPA A 


251 


AACGCTTGAG 


TTTGGTTGTC 


CCAATAGAAA 


GTGPPTTPTT 




301 


CTCTTCTGGA 


GGCACTTCAT 


AGACCGAAGT 


AAAGAGAGGA 


AGAGPAAPGA 


351 


TTGCTGCATG 


ACTTTCTATA 


GCTGCTTTAA 


GGPAGTTPTP 


GT A PPPT APT 


401 


AAAGCTTGGC 


GATAATATTC 


TTGCTGATTT 


GGT A APTPTT 


PAPA TTT a PP 


451 


GCCGCATACG 


TGGCCTAAAA 


AGGTPGGAAG 


AATArTTTTP 
s±t\ l /iC 1 1 1 1L 


1 i i ILi CjtL AG 


501 


AGCTTAAATT 


TAGATTAAAC 


fiTTTHA TPTA 


vxttbiv-. 1 iLbl 1 


1 boAAbi 1 1 i 1 


551 


ACTACTCTCA 


CTTCGGTAGG 


ggaaaagggg 


mprnrppmpprpm 


1 1 oL-(jooALL 


601 


CCCTTCGCGT 


TGCTTGCATG 


tatpppapap 


1 i 1 inl^i 


111 /ibjOO 1 


651 


AGTAAAGGAT 


AGTAGAGAGG 


TTCGTTGPAG 


TGTTPTPPAT 


PAPA TTPPTT 


701 


GGGCCTACGG 


GATTAAAGAT 


GATPPPTGTG 


G A TTP, A TTTT 1 


ill L-oAi lal 


751 


TCTAAGTCCA 


GTTAAGAAAG 


TAGGPTGAAA 


tppttp ar ap 


PPS r PP r P/^ ,r PT"Ti 


801 


GTATCGCTAC 


CTTGAACTTA 


GGGTTCAGGT 


PATT ATTPT A 

url x 1 .rt 1 xol-rt 


A A Arpipr , P7t f r i P 
nAni IVjjL-AIL, 


851 


TCGTTTGAGT 


AGCAGTCTAC 


GTTTTTTTPT 

OXXXXXllV^,!. 


J. bLLnLoU X X 


1 i LLCAAAbjG 


901 


CTTGAAGTTT 


TGCTCTAGAA 


GTTTPTnPP A 
^-11 iLllj^^rt 


oil AoAAbrA i 


ALC 111 GAGG 


951 


TCATTTGGTG 


GTAGAPTAAG 


AAGGTTAPA A 


^ 1 b/ibAfttjAb 


VjCjLCCjiGGIA 


1001 


ATGAGAAGAG 


C C AAAAAT AC 


AGGGTTPPPT 




1 1 AAAb Abj A 1 


1051 


TCCAGCCACC 


AAAGCTCCTA 


AAGPTAAAGA 


APPT AP.P AT^r 


PPA APAPTip/^ 1 


1101 


ATATTTTTGC 


TATGGTAAAC 


TGTTTTTTAG 


GAGCAATTTC 


TTTATCCCGA 


1151 


GGCACATAGG 


ATAGTACAGA 


AACTTGAGAG 


CTCTCAGTAC 


GTGAGGGTCC 


1201 


TGACATAACA 


TTTTTTTTGT 


AAAATACTTT 


CTATAATTTT 


AACATATTTG 


1251 


TGTTTATCGA 


TCCGAGAAAA 


TTGGAGAGTG 


AGAGCGCATG 


TCTTGCAATT 



1301 TAGAATGATC GGGGACGACA TCTAGAGCTA TGTAGACATT GCGTGCGTAG 
1351 TGGGAGCAAA TATAGCGAGA TATAAAGTAT AAGGGAATTG CTGTTAGGAA 
1401 GATAAAGGAG CACAAAGGGT GGATACATAG CCCAATAGCT ATGGTGGTAG 
1451 CAATCAGAGC TATCCAGACG AGTGCAATCG CAATAGTAAC GAAGAGGGCA 
1501 AGCTTGAAAT TATAGCGAGG ACGAGTAGCT GGGGGAAATA GAGAGGGAGC 
1551 CGTTCCATCA AAACCGGGAG TAGCTGAAGA AGCCATAAAC TATTAAAAAT 
1601 TAAGTTTTTT TCGGAGCATA AAGCATTTTA AAGTAGTGGG GTCTTTTTTG 
1651 TCACGGAGAT GTCCTGGACT TCCCAAGCGT TTCTAACAAA GATACCTGCT 
1701 TTTGAGAGGA GAACTTTTGA AACTCCTGCA AGGTCATCCT TCCTTGGCAC 
1751 CAGTAGGTTT TTTCAGGAAA TCGCGGAAAG ATTTTGGCGA AAGCTCTTAC 
1801 AGTTGAAGGG CTTGTGAAGA TAATTTTTTT GTATTTAGAT AAAATATTTT 
1851 TTTTAAGTTT TCGCGGCTTC ACTGTGTAGT GAGGGTAAGA GAAAAAAGTA 
1901 AATCGATTGT AAAGAAATTC TCTGATCACA GGTCTTGCGA GGGAGGAGTG 
1951 GGGGTAGAGA ATGCGGGCTG AAGAGGGCAG TGCCTGTAGC AATGGGAAGA 
2001 TGCCTTCAGC GATTTCTTGA GTTGCTACTA CGTACTTCAC TTGTCCAAGG 
2051 AAAGAGAGAA GTCTTTCTTT GGTGGACTCT CCTATACAGA GGTAGGTCTT 
2101 TGTTTTTAGA GTGGCCTTAG AAAGAAGAGA AGTCATTCTG GAAAGGAATA 
2151 GGTGAGTGGA TGAGGGACTT GTGAGAATCA CATGGGTTGC TTGTGGAAGG 
2201 AATTGAAGAG CACGCTTATT TTGTGGAGTG CTTTTTGCAT AGGGGAAGAG 
2251 AGTTAGAATA GGCAAATAAT GAGCTTGGTA TTTACGAGCG GTTTTTTGAT 
2301 TCAATCCTAA GTAGAGGGTC ATGAGTCTTT TCGGGTAAAA GGAAGGCTGC 
2351 CTAAGTTTTT GTACCTTCAA AGGGATATAT TGAAAATAAT TTTTCTTTTT 
2401 CCCTTGGTTC TTCTTGATCA TGCGTTGATT GACATTTTTC ACTTTGAAGG 
2451 CTAGGCTGGT TTTTTCTGGA CTTAGAGGTT CTCTCTATTA AGGCTTCGTC 
2501 TTTAGAAGTC CTTGCTAAAA GTTTTTGAGA AATTTAAGAA ATTCGCAATA 
2551 GTGGAAATAT TTACAAAGGT GGTTGCGGTG GTTTCGTTGT TGCATAAGTT 
2601 TTTAGAAAAT GCTTCGGGGA AAAAGGGACA AAGTTTAGCT TCGACAGCGT 






2651 ATTTAGCAGC TCTTGACCAT CTCTTAAATG CGTTTCCTTC CATTGGGGAG 
2701 AGAATCATTG ATGAGTTGAA GAGCCAGCGT TCCCATTTAA AGATGATTGC 
2751 TTCTGAAAAC TATTCTTCAC TTTCAGTGCA GTTGGCTATG GGGAACTTGC 
2801 TCACAGATAA GTATTGTGAA GGAAGTCCCT TTAAGCGTTT CTATTCCTGT 
2851 TGTGAAAATG TAGATGCTAT TGAGTGGGAG TGTGTAGAGA CAGCGAAAGA 
2901 ACTTTTTGCT GCGGATTGCG CTTGTGTTCA GCCTCATTCT GGGGCTGATG 
2951 CTAATTTACT GGCAGTAATG GCCATTCTCA CGCACAAAGT CCAAGGCCCA 
3001 GCTGTCAGTA AGTTAGGTTA TAAAACTGTA AACGAATTAA CAGAAGAAGA 
3051 ATACACTCTA CTTAAGGCTG AAATGTCTTC TTGTGTTTGC TTAGGACCTT 
3101 CATTAAATTC TGGAGGCCAT TTGACCCATG GGAACGTACG TTTAAATGTG 
3151 ATGTCTAAGC TTATGCGTTG CTTCCCCTAT GATGTCAATC CGGATACGGA 

3201 GTGTTTTGAT TATGCAGAGA TCTCCCGGTT AGCTAAGGAG TATAAACCTA 
3251 AGGTACTGAT CGCAGGATAT TCTTCCTATT CTCGAAGATT AAACTTTGCA 

3301 GTTTTAAAAC AGATTGCAGA GGATTGTGGA TCTGTCTTGT GGGTAGATAT 
3351 GGCGCATTTT GCAGGCCTAG TTGCTGGGGG AGTGTTTGTT GATGAAGAAA 
3401 ATCCTATTCC TTATGCAGAT ATAGTGACAA CAACAACGCA TAAGACATTA 
3451 CGCGGTCCTC GCGGGGGATT AGTTTTGGCA ACTCGAGAGT ATGAAAGCAC 
3501 TCTCAATAAG GCGTGTCCTT TGATGATGGG AGGTCCTCTA CCTCACGTGA 
3551 TAGCTGCTAA AACAGTGGCT TTGAAGGAAG CTCTCTCTGT GGATTTCAAG 
3601 AAATACGCTC ATCAGGTTGT AAATAATGCT CGTCGATTAG CAGAGAGATT 
3651 TTTAAGTCAT GGGCTACGTC TTTTGACGGG AGGAACAGAC AACCACATGA 
3701 TGGTGATTGA TTTAGGTTCT TTGGGCATTT CTGGAAAAAT TGCTGAAGAT 
3751 ATCTTGAGTT CCGTAGGAAT TGCTGTGAAT CGGAATTCAT TACCTTCAGA 
3801 TGCTATTGGT AAGTGGGACA CTTCAGGTAT ACGTTTAGGA ACCCCTGCAC 
3851 TAACGACTTT GGGTATGGGT ATCGATGAAA TGGAAGAAGT TGCAGATATT 
3901 ATTGTGAAAG TATTGCGAAA TATTCGTTTA AGTTGCCATG TTGAAGGGAG 
3951 TTCTAAGAAA AATAAAGGGG AACTTCCTGA AGCCATAGCG CAGGAAGCTA 






4001 GAGATCGTGT TCGCAACTTG TTGCTGCGTT TCCCGCTCTA CCCTGAAATT 

4051 GATTTAGAAG CTTTAGTTTA GTTAGGAGAG ACATTATTTT ATGGCAGACG 

4101 GGGAAGTTCA TAAATTACGT GATATTATAG AAAAAGAGTT ATTGGAAGCG 

4151 CGCAGAGTAT TTTTCTCAGA GCCTGTAACA GAGAAAAGTG CTTCCGATGC 

4201 AATTAAAAAG CTTTGGTATT TGGAATTAAA AGATCCTGGA AAGCCTATAG 
4251 TTTTTGTGAT CAATAGTCCT GGGGGATCTG TGGACGCAGG TTTTGCTGTT 
4301 TGGGATCAAA TTAAAATGTT AACCTCACCC GTCACTACTG TTGTGACAGG 
4351 GTTGGCAGCT TCTATGGGCT CGGTATTGAG TTTATGTGCA GCTCCTGGAA 
4401 GGAGATTTGC AACTCCTCAT TCTAGAATTA TGATTCATCA ACCTTCAATA 
4451 GGTGGACCGA TTACCGGTCA GGCAACCGAT TTAGACATTC ATGCGAGAGA 
4501 GATTTTAAAA ACAAAAGCTC GCATTATAGA TGTCTATGTA GAGGCGACAA 
4551 ATCAACCTCG AGATATCATA GAAAAGGCTA TCGATAGAGA TATGTGGATG 
4601 ACAGCCAACG AAGCTAAGGA TTTTGGTTTA TTGGATGGCA TTTTATTCTC 
4651 CTTCAACGAT CTCTAAATAT TTTATCTATT CTGGAGCAGG AAATCGTTTC 
4701 CTTCTTGGTG AAACACTTCC TGAGGTTGAA GATGTTCGGT TCTTATGCCA 
4751 AGAGACGAGG GTTGATGGTT TTTTATATTT AAAGCCCTCT TCTTGTGCTG 
4801 ATGCGCAACT CATTATTTTT AATTCCGATG GATCACGTCC AACGATGTGT 

4851 GGTAACGGCT TGCGTTGTGC GATTGCTCAC TTAGCTTCTC AGAAGGGAAA 
4901 ATCGGACATC TCTGTATCTA CGGATAGTGG TCTATATTCA GGATATTTTT 
4951 ATTCTTGGGA TCGTGTGCTT GTAGATATGA CTCTCGCAGA TTGGAGAGCT 
5001 TCTGTTCATC GATTGGAGTC GCGTCCTGAT CCTCTTCCCA AAGAGGTCGT 

5051 TTGTATCCAT ACGGGAGTGC CTCATGCTGT CGTAATTCTT CCTGAGATTT 
5101 CTACTTTAGA TCTTTCTATC TTAGGTCCTT TTCTTCGCTA TCATCAGACC 
5151 TTCTCTCCAG ATGGGGTGAA TGTCAATTTT GTTCAGATAC TGGGACATTG 
5201 CCAGTTGCGC GTTCGTACTT ACGAACGTGG AGTCGAAGGG GAAACTGCAG 
5251 CTTGTGGAAC AGGGGCTCTA GCTTCTGCTC TTGTTGTGTC AAACTCCTAT 
5301 GGATGGAAGG AGTCGATCCA AATCCATACT TGGGGTGGAG AGCTTATGAC 
5351 TGTGAGTCAA AATAGGGGAC GGGTATATCT TCAGGGCTCT GTAACTAGAG 
5401 ATTTATAATT AGATGTGATT TTTGATTTTG TCATGCAAGG ATTTTAAAAT 
5451 CTTGTTTAGG GATAGATCTT GCTCTCTAAC TGGGATTTTT CTATAATCGT 
5501 AATTTATGAT GACGTATCCT GTACCACAAA ACCCACTTCT TTTAAGAATC 
5551 CTTCGTCTTA TGGATGCATT CTCTAAGTCT GACGATGAGA GGGACTTTTA 
5601 TTTAGATCGT GTTGAAGGGT TTATTCTCTA CATAGATTTA GATAAAGACC 
5651 AAGAGGATCT AAATAAGATT TACCAAGAAT TAGAAGAGAA TGCCGAGCGG 
5701 TATTGTTTGA TTCCGAAGTT GACGTTTTAT GAAGTAAAAA AAATCATGGA 
5751 AACGTTTATC AATGAAAAGA TTTATGATAT CGATACCAAA GAAAAGTTCC 
5801 TTGAGATTTT GCAATCCAAG AATGCCCGTG AGCAGTTTTT AGAGTTTATT 
5851 TATGATCACG AGGCAGAGTT AGAAAAGTGG CAGCAATTTT ATGTAGAGCG 

5901 TTCTCGAATT CGAATTATAG AATGGCTTCG CAATAATAAG TTCCATTTTG 
5951 TCTTTGAAGA AGATCTAGAT TTCACAAAGA ATGTTTTGGA ACAGTTGAAA 
6001 ATACATTTGT TTGATGCCAA GGTGGGGAAA GAAATCACTC AAGCGCGTCA 

6051 GTTGTTGTCG AACAAAGCTA AGATTTACTA TTCCAATGAA GCATTAAACC 
6101 CTCGTCCGAA ACGAGGCCGT CCTCCGAAGC AATCTGCTAA GGTAGAAACA 
6151 GAAACAACAA TTTCGAGTGA TATTTATACA AAAGTCCCTC AGGCTGCTCG 
6201 TCGTTTCCTT TTCTTACCCG AGATTACTTC ACCCTCTTCA ATTACTTTCT 
6251 CAGAAAAATT TGATACGGAA GAAGAATTTC TTGCTAACTT GCGCGGTTCG 
6301 ACTCGTGTTG AAGACCAGCT GAATCTTACC AATCTTTCAG AGAGGTTTGC 
6351 TTCTCTTAAA GAGCTTTCGG CTAAGCTTGG TTACGACTCT CTTTCTACTG 
6401 GAGATTTCTT TGGTGATGAT GATGAGAAAG TGGTCACTAA GACGAAGGGG 
6451 AGCAAGCGAG GCCGCAAAAA ATCTTCTTAA TCTTCTATTT TGTGAAGTAG 
6501 TTTATTTTTA GACGCTGTTC TTATTGCTTC TTTACATGAT CTTATTACAA 
6551 ATCTTTCTTA TTTCTATTTA TTGTTTTGTT AAAATTTTAA CAATAGCTAT 
6601 TTATTATTAG TCATTTTTTT AATTAAAAAA CTGTTAAAAT TTTTAAAGCT 
6651 AATTTAAGAA ACAGTGAATA GTTCATCATG TCATCACTAC TGAGCTGCGG 
6701 AAGAATAGAG CCGACTCGGG TTACCTGTAG CTTAAAGACG TATCTTGAGG 
6751 ATACGAGTCA GAATCAGTTG AGCACACGTC TAGTTCGGGC AAGTGTCATC 

6801 TTTTTATGCG CATTGTTGAT CATTTTGGTT TGTGTGGCCC TCTCTAGTTT 

6851 GATTCCAAGC ATTATGGCCT TGGCGACCTC TTTTACGGTA ATGGGGTTAA 

6901 TTCTTTTTGT GATGTCACTT CTTGGTGACG TTGCAATTAT AAGTTATCTT 

6951 ACTTATAGCA CTGTTACGAG TTACCGGCAA AATAAGAGAG CTTTTGAGAT 

7001 TCACAAGCCC GCTCGCTCCG TTTACTACGA GGGGGTCCGC CATTGGGATT 

7051 TAGGACGATC ATCTTTAGGC ACAGGCGAGA TTCCTATAGT AAGGACGTTA 

7101 TTCTCTCCAT TTCAGAACCA TGGTCTTAAC CATGCCTTAG CTGCTAAAAT 

7151 TTTCCTATTT ATGGAGCATT TCAGCCCTGA GCCACCGAAC GAGCCTTTGG 

7201 TGGATTGGGC CTGTTTGATT CGGGATTTTA GGCCTCACGT CAGTTCTTTG 

7251 TGCTTTGTTA TTGAAAAACA AGGGTCATCG CTGAGGACTA AGGAAGGCAA 

7301 TACGATTTGT GAGGCTTTCC GCTCTGATTA CGACGCCCAT TTTGCTATGG 

7351 TAGATTGCTA CCGGTTGATC CACTCTAAGT TGATTATAGA GAAAATGGGA 

7401 TTGAAGAATA TCGATATCAT TCCGAGTGTC ATGGTTCGTG AAGATTATCC 

7451 TAGCCGTCCT GGGGAGGGCT ATCGCGAAGG CCTATTACGT ATGTATGGTG 

7501 GCAAGGGGGC TCTGTGACTT CCCTACTTTA GTTCCTAATG AGCGCTTGCC 

7551 CATAGGGCCT TTCTTTGTCC CGCAGCACAC TTCCGGTGCG AAGGGTAAGG 

7601 AGTTTGCTAA AAGGAATTTT TCTATAATTT CGGGATTGGA TGACATATTA 

7651 AAATTATGTA TTCTTCAAAG GCGTCCTTTT GCTTTGCAGT GGGATAACCT 

7701 CTCTGTGAAA AGTGATTATG AGGAGGCTGG GCCCGCTATT GGGATACGTT 

7751 CTCTTGAGCC ACAAGTTTCT CAAATTTCTC CAGCCCACGG CCGGCTATGT 

7801 AGTACTTTGG TCCAGTGGGC CCCTATCCTT GGTTCTGAGG AGCAGCTAGT 

7851 TTGGTTAGAA GAAACAATGA AGCGCCTAAA GTTTCCTAAA AGTTTAGGTA 

7901 GTAAGGACGC TGTTATTGTG GATTCGGAAA TGGTTCCTGT GAACGCCAAT 

7951 CCTACTCAAG AGATACCTGC AGCTTCCGAG ACTGTAGAGT CTTCACCTGT 

8001 AGCTCCAGGG AATACAACAG ATACCATGCC TGCAGCTTCG GGAACTACAG 
8051 


ACACCACATC 


TGGGGTTTCA 


GAGGCTGCGG 


CGGCTGAGGC 


TGPPPTPPAT 

X O v_ O 1 Own 1 


8101 


TCTACACCAG 


GGACAGAGGA 


GGAGCCGAGT 


TTTTCTCTGA 

X X X x V* x \_* X VJJTI 


PPT ATPPPPT 


8151 


TGTAGTTCAA 


AATGTTCCCT 


ATCCAGAGCC 


GCCTAAAGAA 


CCTPAPPTPA 


8201 


TGTTTACAGA 


TGAAGAAAAA 


AGTCTGATTT 


TAGAAGCTAC 


TPGTGPGPPT 

X v_- O X UV-uUU X 


8251 


CGTATGGAGT 


TGGACTTGTA 


TAATGGCTAT 


TTAGCTGATT 


ATGAACTTTC 


8301 


TAAGGATGAA 


ATACAGAAAC 


ACGTTCCTGA 


TTTACCTGAG 


AATTGGCGTA 


8351 


CGAATTGGCG 


TTGGTCGGAG 


AGGCTCTATA 


AATTTTTCTT 


TAAAAPAAAP. 

x n^7in_n\_»n_nn.o 


8401 


AAAGAAGGAT 


TAGAAGAAAT 


TTTCTTAAAC 


AAAGAGTTAG 


GGAATATPAT 

X n X On X 


8451 


TCTTGCCCGA 


GGGCTGGCGG 


CAACTCAGTC 


ACAAGCACGT 


ATTAAAPTAT 

n x x nnno x n x 


8501 


TCAATTCTTT 


AGTGGCATGG 


CTCTTGCAAA 


GPTTTAAPPT 

V_rV_ X X XnnHOX 


APPPAPP APP 


8551 


TGTACAGCTA 


AACCTCTTCC 


TAPGTPAAAA 


ptap ArrTPT 


TT A A ATPPr 1 A 
1 1 nn/i 1 Luun 


8601 


ATTCGAGTCT 


AAGCCTAAAA 


ATAAPATPTT 


A A PPP A A TTT 


i X ub i oooo 1 


8651 


CTGATGAGGA 


GATTCTCTTT 


AAGGGGPTAP 


ppptpptapa 


n-Ccvcch ATr 1 

otl. 1 oonn x O 


8701 


GAAGGTTGGT 


ATGACCATCC 


TGATCAAGCT 


GGAGAPATTP 

w*jjii\jn\jn. x x \_, 


dC^CCZCZT A PT 
uu X v-OO 1 no 1 


8751 


CGAGGGTCTG 


GTGCAGGCTG 


GACGTATTTC 


TGGATATTPP 

j. un x ii x x vjvjt 


PAPA ATP APP 
OnOnn l Onoo 


8801 


CGTTTGGGAG 


ATTTGTCCTT 


AGAGGAGTTG 


GTGAAAGAPP 


T APPP APPTT 
x nv-OOnOo x X 


8851 


GTAGAGCTTT 


TGGAGAGTTT 


AGTTGCTTCT 


GGTPAPATTA 

\jvj x vjn^jn x x n 


TPP APTTPTT 
X OOnO i lull 


8901 


TGAGTCTTCG 


GATGAAGAGG 


GTGCTTTTAT 


TATPGATAAP 

X n X On X 


PA APPTAPPA 
OnnOO 1 nuen 


8951 


AGACTGCTAT 


GCTAAAACAG 


CGATTTAAGA 


GTTPTPTPAP 


P APP A APPTT 
Ono unnb O 1 1 


9001 


GTCGGGAGTT 


TTGCTGATGA 


GAGTPTTPPP 


AC A PPT 1 A PPT 


TTAOf^ATTT"!" 1 


9051 


AGTTTAGCGT 


GGGGTAGAGC 


ACTPPAPPAA 

n* — L \_, v_nv, o.nn 


TPTTAPPP AP 


PTPr'TTrTT" A 
O I OO 1 1 OOOn 


9101 


CCAAGCTTGG 


AGATCCTCCA 


TPTTTTATTP 

X ^— J X X X X c\ X X VJ7 


TTTPTPT A f~;T 
X X i i rib 1 


/ibLLAAA 1 oo 


9151 


TAGCCGCTCC 


TAGGAACAAT 


mmrnrpmpmrprprn 
1 X X 1 x X 1 1 1 


TPPPAATATA 

X UO^_*nn X n X n 


A A ATPPTPAT 
Ann 1LL1 on 1 


9201 


TTAGAGAATA 


GGTCTTCAAG 


ATCGTGGTCC 


TTTGGAAGTT 


GCTGGATACT 


9251 


TTTGCTGAGA 


TAGCTATAGG 


CGTCGGGATC 


TTTAGAAACA 


GACTTTCCAA 


9301 


TCCAGGGGAC 


GACAGCACGC 


AAATAGAGCT 


TATGGGCACT 


ATAGGTAGGG 


9351 


TGTGTTTTTT 


TTGGAGGTGT 


GAGCTCTAGA 


ATGCCCAGTT 


TTCCAGAAGG 
9401 CATAAGCACT CGGGAGATTT CTTGTAGGGC TTTATGTGGA TCCGAGAGGT 

9451 TCCTGAGGCC ATAGGCCATC GCTGCTAGGG GATAAGAATG ATTCTCCAAG 

9501 GGCAGTTGAT TAATATCGCT ATGAATAAAA GAGCAAGAGC CCTGGGGAAG 

9551 GTGTTGTTTT GCAATGTCGA GCATTGCTGA GGAAAAGTCG ACGAGAGTTA 

9601 CTGATGCTTG AGGGTGTGCG GCAATATAAC GCTTCGCGAC TTTTCCTGTT 

9651 CCTGCGCAGA GATCCAGGAG AGAGTATCCC GACCCTAGGA TCTGGATCAA 

9701 AGAGCGATTC CAGAAATGGT GCATTCCTAA AGAGAGTATT GTATTTGTGC 

9751 GATCATACTT ACTCGCTATG GAATCGAAGA TCTTTTTACA GTCGGGCTTG 

9801 TTGGTAGAGG GTTCCATAAT ATTCCCGGAA TTTTTCAAAG CTTTCGTAGT 

9851 GTTCTTCTCC TAGACGGTAC TGGCATAGGG CATAGTATTC TTGAAGAAGA 

9901 GAAGGGGGCA GACCTGTATG TTGATGAGCT TCTTTAAGGA CTTCTTCGGG 

9951 TGAAGATTCG AACTGTTGGA GGGCTTCTTC CATCGCAAGG TTGGGTAGGG 

10001 GATGTTCTTT CCAAGAGGTG CTGTGTAGAA GAAGAGCAAA TACAAAAGGT 

10051 AGCTTTGTAA GATCATACCA CCCCGAGGCA AGGTCATAGG TTACAAATCC 

10101 AGGAAGTACA GGATGTTGTA GCGCTGCATC TCCGATTAGG AGGAGGCCAT 

10151 CATAATTTTC AGGGGTTTGT CTGAGTACTT TTGTAGTTAT GAATCTTAGG 

10201 ATATGAGGAG TTGGGATGCG CCAGAGATGA CGACAAAGCA CTTTTAAGAG 

10251 TCCTATAGAG GAGCGACTTT CTAAAGTTGC GGCAATCCGA GGTTGCGGTG 

10301 AGTTAAAGAA AGTGGGAGCT GCATAGAGGT TTACACTGAG GATACGTTGG 

10351 TTTGCTGCAA TTCCAAAGCC GGGGACATAC CCCAAGTTAT GAGAGATAGC 

10401 TCCTAGGGAT GAGGTCAAAG CAACATCGAG TTTCCCTTCG ATTAGCAAGT 

10451 TGAGGAGGTC TGCAGGGGGA GCAAGAACAC AGCGAATATC GTTTCTTTTT 

10501 ATGAGTTGTA GGGACAGCGG AAAGGAATTA ATATAACTTA CGCAGCCTAA 

10551 GCTTATACAT GGCTGGAGTT GGTTAGACAT GGCGTTCTCC CTTGTTGTGT 

10601 GATGAGGGCC GCCATTCCCT CAGCGTCCAT TTTAATAGGT TCTTTAGATG 

10651 AGGCCATCTG GAAAACCTTT TCCCCCATAT GTGTTGAAGA AAGGTCATTA 

10701 GCACCACAGG AAAGGAGGTC TAGAGCTGCC TCAATACCTA GGTAATTCCA 
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107 51 TAAGGCTTTC ATATTGGAAA 

10801 TTAAAGATTT TAGAGGGATG 

10851 TTTCCTAGGA CATTATTTTC 

10901 AAAGCCCTGA GTTTCGTCTT 

10951 CGAGGTCTTC AGGTCCTTCT 

11001 TGGATTCCCA GTTGATGAGC 

11051 AGAAAGGCGT TTGGGAGCTA 

11101 CAGCTCCTCC TCCGGGGATG 

11151 AGAACATCGC GAATAGAAAG 

11201 AATGGCAGTA AGAGCTTTGA 

11251 TAGTAAATAG ATCGGAATAG 

113 01 CCCACGATAT GTACTTCTGT 

113 51 TAGAAGATCA TCTGGGGAGT 

11401 CATAGAAAGA GCAAAATTTG 

11451 AGGTACAAGG TTGAGGAGTA 

11501 AACTTGGTCT GCAAAATTCC 

11551 GGAGGAGGAG ATGAAGAGCG 

11601 AGTTTTTCGA ATATGGAGTA 

11651 GCACGTCGTC ATTTGATGAA 

117 01 GTAGCACAGT GCTCTGTTTT 

11751 TTTTTGACAC CCTTTAGAGA 

11801 GTAGAATACC CATCCCTAGA 

11851 AGAAAAAGTG TCATCATAAG 

11901 AAATCCACCC AGATTGGGGA 

11951 GACCCTGAGA TAGTAGAATA 

12001 GGGTCACTGA ACTTAGGGGT 

12 051 TTGCTATCGG GGAAGAAGGT 



AGTTGTCTAA GAAGATTCGG GCTACTGCCA 
GCATGACCCT GGCCTGATTT TCTTAATCTT 
TTGGGCGAAT TTTAGAAGTA TGAAGTTTTT 
GTAAGTCGCG GACTTTTACC ATGTGGGTGA 
TTATGATAGC AGAGCATGGT TATATTGCTA 
CATCTTATGG ATGTTGAGAA AATCAGAAGA 
AGAAATTACG TATTTTGTCG ACGAGGATTT 
GAATCAAGAC CCGCATCTTT TAATGTGAGA 
GTTATCAAGA TCTGAGAGAT AGGCATATTC 
TATGGATCTG AGGATCGTAC TCTTTGATTT 
TATTGCAGAT TGCAGGAGGG GAAACAGCCT 
AATTGGAGTT TTTATATTTT GGATTTGCTG 
AGAGCCATCC TTTAGGGTCT CCAGGTTTTG 
CAGCTGAAGT CACAGAAATT TGTAGGATAG 
GTATACAGTG TCGCCAACCC GTTGTTTGCG 
AGAGTGTGCG TTGATCTTCT TTATTCGTGA 
TCTTCACTGC TTAATCGTTC TTGGGCATCC 
GAGGGGGGAA GTTTTAGGGG GCTGTGGGAG 
CACTTTGATG TACTATTCTC TCGAGATTTT 
GTCACATGTT TTTTTTTGGC AGCAATCTGG 
GGGGCCTGCC AAGCAAATGG GAACCTACAA 
CCTAACAATA CTGTGGCACA GCAGATTACG 
AAATCCTTAG ATAGGATAGT TTCTTAATTT 
ACTCCAGGCC ATAGCATTGT CTGCCTGAGT 
CAAAAGGTGC TTTACCGTTT GGATCTTTTA 
ACCATATCAT CGTATTCATA GTCCAGGTTA 
ATGGAGAACT TCGCCATTGC GGATGATTTC 
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12101 TACAGTCTTG AGTAGGGCAG 

12151 TGACGTTGAG TCCAGGTTTC 

12201 GCTGAAGTGA TGTTGAAGCT 

12251 GCAATGACGT GCGAATAAAG 

123 01 TACAAATGAT AGCCGTCAAC 

12351 AAGTAGTCTT TATAAATTCC 

12401 GAAGCGGAGA TTCTTCTTTA 

12451 CGCTATCTTT ACCTTGGATA 

12501 TCTGAAGATC CCCAGGCATT 

12551 GTAGAAATTC TCAAAGTCAA 

12 601 GAATAGAAAT CATGTCGTGG 
12651 GG AATATGTT TGTATTCTTT 
127 01 ATGAAGGATG TGACGCACTC 

127 51 ATCCGGATAG TGTGATGAAG 

128 01 TGATTGATGA GCTTCCAAAT 
12851 TGATGAAGAA GCATAGAAAT 

129 01 TACAAGTTTC AATATTTTCT 
12951 AGGAGACCCC ACATAAGATT 

13 001 GGCAGAGATG AAAATTTCTT 
13 051 AAATTCCAGG CTCATTGAAA 
13101 TCTGGGATGA AGAGCTGCCA 
13151 AAGCTCGATT CGGGTCTCTT 
13201 CGTCTTCAAA TCGCACGGTG 

132 51 GAGGGAGTAA AGATCTCTAT 

133 01 AGAGAAGACA TCGGGTTCAT 
13 3 51 AGAGGTAAAA GGGTTTGCGA 

134 01 GCATCATCGA CTTGAGGATG 



TGCCTGCCAC ATGACCAGAG ATGTGACGGT 
GACCCTGTGG AGAGTTCGGA GCCCATAGGG 
TAAGACGATC CTAGGTCCTG TTGTAGCGTA 
CTTCAACAAG AGACTCTCGG GTATATTTAT 
CCTGGGGAAT ATTGCACTTG CGGAGAGTCA 
TCGATCGTCG AGACCCCCAG CAACAAATCC 
ATCCTTCAAT TACTGTACCT CGAGGATCTT 
GGGAAGGGGT TGTTTAGAGC GGCTGTGGTT 
ATAAATTTCT ACAACTCTTT CGAACTCGGG 
AACCATGTTC TTTAGAAGCT GTGAACGAAG 
TTGACAGTGC TTTTATAGAG CTTGGCGAGG 
GTGTTTCGAG TGGGACTTTG TTTCCTTGGT 
CCTCGAGATG AGGTTCTCCG CTATATTGGA 
CGATCTTCTT CATTAAAGTC GGAGACAGTT 
ATCTGGAGAG AGGTTCTCTT GATTTTCGAA 
TCAGAGCGCG GTCATCTCGG AAATAACGCA 
TCAGAGTCGA CGCGTTCGGA TTCGCCGTGG 
CGGGGCGGAG TCAGCGAAAC ATTTGATAGG 
GTGTAGAGAG GTTTTTCAAT TGGATGCGAT 
TAGAGATTAG GAAGAATAAC AAAGCCTGTT 
ATTTAAATTT TCTCTAAGAT GCTCGTAGGA 
CAGGAGAGAA GTTGGTGAGG TTCCCGAATT 
ATATCGAAGC GTTTGTTTTT AACGACATAG 
TTTTTTTAGG ACGTTTCCGC GGATATCCAT 
CATAGTTTCC TTCTCCTGTA GGATCGATGT 
CGTTGTGCGA AAAGTTGGGC TCCGTTCCCA 
GTTTGGAGAG GCTCCCATGA CAATAGTGAG 
13451 GGTTTCTCCT ACTTGAAGTT CGTAGGGGAG AGTAAACTCG AATTGTGGAA
13501 CGGGATTGTC TTTTACAGGA ATGGCGGTTG CTTCGATGAT TTCGCCTTCT
13551 GGCATTTCTG CGTAGATTAC GTTTCTAGTT TGGGAGAGAT CTGTCGCGGG
13601 GGCTTCCCAA TCTGTGGGTT TCCCACTTCC TGCTAAGTCA AATTTACATT
13651 TGGTTCCAGC TGGTAGTGGT GTGGCAAGGG AATAAAGAAA TTTCCAAGTA
13701 GAAATTTGCC CTGCTCGAGC TATCGAAGGG TTAACGTAAC AAACAGATCG
13751 TCGCATAGTA AGAGGGAGGC TTTATATGAC TTAAAAGCGC CATCATATAC
13801 TAACAGTGAG GTTTTTCTCA ATCCCCGTCT TTGTTTAGTG TTTGTATCGC
13851 TTCATCCACA GTATTGAATA TTTTAAAGTA AGAAAGGAAT CCTGTAACAT
13901 AGAGAGTTTG TTCTATGGTT TTTGGGACTG TAGTCAGGAC AATTTTCCCA
13951 GAATGTTGTC CTACTTGATG GTAGCTTTGC AGTAGGACTC GGATACCTGC
14001 ACTGGACATG TAATCGAGGT GAGCACAGTC GAGAATGATA TTTTTGGATC
14051 CAGCTGCTAG GGATTGGGAA ATATTTTCTT GTACTTCTGG AGAAGAAATT
14101 CCATCAAGTT TTCCGTGGAG ATGAAAGATT GTTGTTGAGC CGTGTTCTTC
14151 TTTTTGGATA TCACTCATCT AGATAGTTCT CCTAACTATA CGGGAGCTTA
14201 AGTTTTCACT CTGATAAATC TTTAGCTTTT TTGCAAAGAG ATTTTTATTG
14251 GTGATGTTTG AGAATTTCGA TTGGGGGGAG GCAGGATGGG ATCGTGGAGT
14301 GCAAGGAAAT CGGTCCTAGG ATGTTTACAT TCTTAGGAGA TATTGAATTT
14351 ACGTTTTCTT GGTGTGATTT TTAGTTTTCC GACATTTCGT TCTGTGCAGG
14401 TAATGATTTC TATATCGAAG TTTTCGTGAT GGATACGCAT TCCTTTTTGG
14451 GGAACAGCAC CCACTTTATG GAAGACATGT CCTCCTAGTG TATCGTAGCT
14501 ATTTTCATGA TCGATTTTCA AATTGAAGTA CTCTTCAGCG TCGGAGATAT
14551 TCATTCTTCC ATCTACAATC CAAGAGCTTC CGATTTTCTT ATAAGGAGTA
14601 TTTTCTTGTA CGTCGTGCTC GTCTGCGATC TCTCCTATAA TTTCTTCGAT
14651 AATATCTTCC ATGGTAGCGA TGCCTTCTGT GAATCCGTAT TCATTGACTA
14701 TGATGGCTAG ATGGCGATGT TTTTGTCGGA ACTCTTGGAG AAGAGAGGAG
14751 GCTTTTTTTA TTTCTGGGGC ATAGAATGGG GGTTTGCTAC TGAGGATATG
14801 GGTTGGCTGA GGTCGTGGCT GCTTGTATAG AGCAGTAAGA GATCTTTAAC
14851 AAGAAGGATT CCTGTGATGT TGTCTAAGTT TTTTTTATAA ACGGGAAPGP
14901 GACTGTAGCC TTCTTCGCTT ACGAGAACCA GAGCTTCTTG TAGTGTAGTT
14951 TCTTCGGGAA GTGCGAAAAT ATCTACTTTT GGGATCATGA CTTCACGGAP
15001 AATGAGGTTA TCAAAAGCGG AGAGGGCTTC GGAGAGCTGG CTTTGAAATG
15051 ATGTTGAAGA TCGTACTTGT TGGTTAGGGC GGCGTCTGTA AAAGAGCAGT
15101 TGCAGTGGGA AGAGACCGAG TTGGAATACC GAAGCTAGAA AAGGGAGGTG
15151 GGCGGTGGTT TCTTTAGGGA CTTTTGTAGA GATCCATGGG GGGAGGAATC
15201 CGTAAGCTAT CAGGGCGCTT AGAGAGTATA GGGGCCAGAA TAGGAGATPT
15251 TTGTGAGCTG TTTTTGGAGG GAGGAGGGTA TAGAGTTTTG TPPPPAPAPP
15301 TCCATAGAGG ATGCAGAGCA GCGTGGCGAG AATTGTAGPA GPAPTPPPPA
15351 AGGGGGGATA CTCTCTTCCT TTATCTTTGA AGAAPpPTTp HTTTAPPPTT
15401 TTTAGGAATT TTGAGGATCC GTGACAGGAC GGTTGPGTAA PPPPPAAPPP
15451 TAGGAATAGA AGAATACAGA ATATGGPTAA AAPAATATPP APPATPTTAA
15501 GCTGTTAGCA AAGCATGTTT TTTTCTTAAC ATACACAGGA TTTHATTTTP
15551 TTTAACTCTC ATTTTTCTCT TTTCTTCTGA TGAGGTGTCG TPPTATPPPA
15601 GCATATGGAG AATAGAGTGG ACGAGGTATC TCGAGATTTC TTPPTAPATA
15651 TCCTCTTGGT TTGGGGATGT GTTCTCTAAA AACCTAAGAG PGGPPTPTPP
15701 GCTAATGAAT GCTTCTCCTA AAACATGAGG ATAAGPGGGA TPTPPPPPAP
15751 CATCAATAGG CAGAGTGATC GTATCTGTTA GAGAAGGATP APPAAATAPP
15801 TTATCATGGA GTTCTGCAAG AGCTTTATCT TPTAGGAAPT APATAAAAAT
15851 TTCATTAGTT GTTACTTTTA AGTGCTCTAA GAGPGTAAGA APPAPPTTPT
15901 CTACAGAAAC CAAATGAATA GGAATACATG TTTGPTPATT PPAAAPATfZT
15951 ATTTTGATCT TTTCTTGCGT CACGCGAATG AAATTCCCAT AGAGAAAAAC
16001 AAACTTATTT TAAAATAGGG GTCTTAGGTA AACCTGTGAC TTTTTTCGCT
16051 GTACTATCAT TCCAACGGCC CAACTTACGC AAGACTTCTA CTCGCTCAAA
16101 ACGCTTCAAA ACATTTCTTT TGGTAACCCC TTTGACAGAT TTACCATAAC
16151 TACGATGTCG AGACATAATC CTGCTCTAGA AATAAACCTA TTTTCGGGAT
16201 AAAAATGCTA TCACTGGTGC TCAATGCATT CGTATGCAAT TTATATACAA
16251 TTCTTGGAGC TGGCGCTGGA TGCACAATGG CATACTTAGG TTTACGTTTT
16301 TTAGGACTTT TCGCTCTGCG CCGAGCTTGT TTACTCATTC GTGTCATAAA
16351 TAACTCGATA TTGAATTTTT TAATTTCTCC AGACAACGGA AATTGAGGAT
16401 ACGGAGTACT TTATAAGAAA AAGGATAGTA AAAGAAGAGT TTTTTTTCAA
16451 GAAGATGACG TCTTTTAGCT GCCTTGATCT TGGTGTAGCT CGTCAGGGAG
16501 GGGAGACGAG GGGCATGCCA TGGGGGTTGA GAGGACTCCA CAGGAGATAA
16551 TATATTTGAA AGCATCTTCG ATTTTCATAT CTAGGAATAC GATATCAGAT
16601 TTTCTAAATA GGGTAAGAAA CCCTGAGGTG GGGTTGGGTG TTGTTGGGAT
16651 GAAGACCGTG ACGAGGGGGT CGTCTTCCTT TTCTCCTGTG CAGCATACTG
16701 TGGGTGCGTC TCCAGCGACG AGACCGATGC ATTGAACATT TGCGTTAGGG
16751 AAAGGAACCA TAACTACTTG TTTGAAGGAT CCTGATTTTG ATCCAAATAT
16801 GGTAGTCATG ACTTGTTGCG CAGCTTTATA CACTGTTTTA ATGATGGGAA
16851 TTCGGTGTAA GATTTTGTCG TAGATAGAGA GTAGGGATTT AAAAATCATA
16901 ATTCTCGTGA GGAAACCTAG GAGCACTGTG GCGAAAAAGA GACCGAAGAG
16951 TAAAATGATT TGCAATACGA ATTTTAGAAG AGCTCTATGT TTAGTATAAA
17001 AGCTAAATTT CTCAAAGAAT TCCGAAGCCA AGCCTACGAA GGGTTGGGTT
17051 AGGAAGTTCA TGATCATAGT AACAATAGCA ATAGTAATTG CTAGAGGAAG
17101 GAGAATAACA AGTCCTGTAA TAAAGTATTT TTTCATGATT CTCCTGCAAG
17151 ATATGAGGAA ATGGGCATTT GTTTCTTTAC TATACAGCTT AAGATTATTT
17201 AAGATAAAAC TTTTCCCGAA TCTTCTGGGG ATAGGAGAAA TCTCCATGGG
17251 ACATCACGAT ACTCTTGAGC ATAATCGATG CCGATCCGGG CAGTTGCTGT
17301 TAGAGTCCCA GAGATTTTTT CTTTGCTGAT ATAGAGAGCT GGGGTATTTA
17351 GGCGTTGCCT ATTGTTTTCC AAAGAGATTC CTAGAGCTTG GCACACTTTT
17401 CCGGGTCCAT TGGTGAGAAG GTGTGGGGGT TTATCTCTCC ATTGGCGGCG
17451 TTGGATCATA AGTTCTTTGC CTTGATCAGG AAGGATGGCC CGGATCAGGA
17501 CGGCATGGGG AATGTCCTCA GGTCCAGTGA CAACATTCAA TAGGTGATGC

17551 ATGCCATAGC AACGGTAGAG GTAAGCAGAG CCTCCTTTCA GGTACATCGC

17601 TCTGTTCCTC TGAGTTTTTC TGTAGTTGTA GGCGTGGCAT GCTTTGTCAT

17651 CAGGGCCACG ATACGCTTCG GTTTCTACAA TGTAACCTGA AGTTATCAGA

17701 CCCTCATGTG TTGTGATGAG TTTATGTCCT AAAAGCTGTT GCGCTAGTGT 

17751 AATTACATCT TCCGATAGAA AAAAATGTTC TTGTAGCACG TTACGAGGCT 

17801 CTTTTTTTCG TTCCTTTTTT CTTAGAAGGC GTTTTCTTTA TTTTCTTAGG 

17851 TTTATCTTCT GTGGTTGTCG CTATAGACCA GACGATTTTT TGCGTAAGGA 

17901 GATTCACGGA ATCAATAGTG ACTTTTATAG AAGCTCCAGG TTTCATTTTA 

17951 TCTGGGATAG ATTCTGGAAG AGCGTTTTTC TTTAGGGAAT ATTCTTTAGG 

18001 GAGTTCTGCT GCTGCAATGA ACCCTTCATG GCAGAATTCG GTCACTACAA 

18051 ATGAGAGTCC TTCATGATTT GCAGTGATGA TATACGCATG GTATGTAGTT 

18101 TTAGGTTGCT CTTGCAAAAA TTTATTTATG AACCGAGTTT TTTTGAGGTT 

18151 TTCGAAAGAA TTTTCTGCTT TTGCGGATAC TCGTTCTTTT GTAGAGCATG 

18201 CTCTTACGAT AATTTCGAGG TGCGTTTGGT CTATAGATAG GGGGTTGAAG 

18251 AGAAGCCTGT GAACAATAAG ATCGATATAT CTACGTATGG GACTCGTAAA 

18301 GTGGGTGTAG TAGTCGAGCT TAAGTCCGTA ATGACCTTTA TTTTCTGTAG 

18351 AGTAGGAGGC TGTTTTCATA CTTCGGACAA ACTGCGAGTG TAGAACTTGC 

18401 TCTAGGGGAT GTCCTGCTGA CGTAGTTTGC AAAAGGTATT GGTAATCAGG 

18451 TTCTTGTGTG GGAGTGAACG TGATATCAAA GCCCATGTTT TTTGCCAATT

18501 CTTGGAAGGC GAGTAGGTTT TCATCATTGG GAGGTTCGTG ACTACGAAAA 

18551 GGTAGAGAAA CGCCTTGATG GGAGATATGA TAGGCGACCA CTTCGTTTGC 

18601 TTTAAGCATA AACTCTTCGA TGAGTTTATG GGAGAAGGTC TGGTGGTTTT 

18651 CTATCAGAGC TACGGGTTCT TGAAGATTAT CCAAGGACAT AGTGACTGAG 

18701 GGGAGGACAA AGCGAATGCA ACCACGTTCT TCACGGATAT CGGAAAACTT 

18751 TTTACTTAGA GTGGCCATCT CATTGAGGAT TTTTGAGAGG GGGTGGGAGT 

18801 GTTTCTTTTC AATGATGTTA TCGACTTCAT CGTAGGTCAT ACGATATTTG 
18851 CTTCGAATGA CGCTACGGAA AATCTGGTAA TCTGAAAGAT GACCTGATTT

18901 TGTAAACGTC ATAAATACGG ATACAGCGAG TCTATCAACG TTTGGTTTTA 

18951 AGCTGCAGAG ATTATCAGAG AGTGCTGATG GCAACATGGG AATGACTTTC 

19001 CCTGGGAAAT ATGTAGAGTT ACAGCGTTTA GCAGCTTCTT TGTCTAGGTG 

19051 AGAATGTGGG GTAACGTAGT GGGAGACGTC TGCGATGTGT ACACCAAGAA 

19101 TGTAATTGTT ATTATGATCG TAGGTGAGGG AGATGGCATC GTCGAAGTCT 

19151 CTGGCTGTGG AAGAGTCTAT GGTGAAACAG AGGAGATCAC GGAGATCTTT 

19201 GCGAGAGTGG AGAACTTGGG TAATGTGTTT TTGAGAGAAA AGGCTTGCTT 

19251 CTTCAATGAC CTCTGGGGGG AATTCTTCGG CAAGGTTATA TTCGGCTTGA
19301 ATTGCCTGAA AGTCCGCTTT AGCGTTGGTG ATGTGGCCAA TAAATTCGAG 

19351 CATTTGTAAG GCTGGAGAGG CTCCTTCTTG GGGTTTATCT ACCCAGGGAG
19401 GAGTGCTCAG AAGAATGCGA TCGCCGATTT TGTAAGTGCG TCCGGGAAGG 
19451 AGTTCTACTG GAATTAAAGA TTGGGATCCC GACATGCTTG TGTAGGCAAG 
19501 TGCTGATGTG GGACTGACTA GTGAGGTGAT CGTTCCTACG AGTGTTGTTT 
19551 TTCCTCTTGC GAGTACTTCG CTGATAGTGC CTTTGAGTTT TTGTCCGTCT 
19601 CTTGGATAGG GAAGCACGGA GACAATCACG TGGTCACCAT CTAGAGCCCC 
19651 GCGTAAATCT CGGGCGGGAA CAAAAATATC AAATGGGTAT TCTTCGGGGT 
19701 TGTCGGGAGA AACAAAACCG AAACCTTTTC TAGCATGAAC AAATAGGGTT 
19751 CCTGGAATAA AAATCTTCAA GGATTTACCG TATGTTCTTC TCCCTGGTTT 
19801 TCTTTTTGGT TTTTTCAACA ATTGGGCTCC GCCTGTAAGT TTAGGACATT 
19851 GAACGAGAAA TCCCGTAGTT TCACTCGTGA ACTGAGTGGC TCAACAAAAT 
19901 TTTCCCTTTT AAGTGGGTTT TGTGATTATG AAAGGAGCAG TCTCAAATTC
19951 AAAGCCAATT GTACTAAAGA AGCTCTGTTA TTGCAACTCC TTGATCAGAG 
20001 AATAAGAGAA TGGAACTGTT TTCTGTTTAT AAGGAAGTTT CCCATCCTCT 
20051 TATGAGGATG GGAATAGAGA AACTTAAATT GAAAATTTTG ATTACTTATC 
20101 GTCGTTATCA ATAATTTCTA CATCAGCTTC TTCGATATGG TCTTCTGAAG
20151 AACCGTTATT TGAAGGAGGC TTCGTACTGA AACTATGTTT TTTCAAATCT 
20201 TCTGTATTGA TGTTAGGTCC ACCTTTAGCA TTGGCTGCCG ATGATGCTGC

20251 TGCTGATGCA GACTGCGATT GCATAGACTC TCCAATTTTT TGCATATGCT 

20301 TGCTTAGGTC TTCAGTAACC TCTTTAATTT TTTCAATAGG AGCGTCATCT 

20351 TTGAGTGCGT TGCGCACGTT TTCGATTCGC TCTTCGATTT CTTTAACTAA

20401 AGTTTCAGGA ATTTGCTCCT TATAATCTTT AATAGCTTTT TCGGCTCTGA 

20451 AGATCATGCT ATCGGCTTCA TTTTTAGCAT CTGAAGCTTC ACGACGTTTT 

20501 TTATCTTCTT CCTTATTAAT TTCGGCATCT CGAACCATTC TTTGGATTTC 

20551 ATCTTCTTGA AGTCCTGAGC TTGCTTCGAT ACGAATTTTC TGTTCTTTAC 

20601 CGCTGGCAAC ATCTTTAGCT GAGACATGGA AAATTCCGTT TGCATCGATA

20651 TCGAAGGAGA CTTCGATTTG AGGATGGCCT CGAGGAGCCG GAGGGATATC

20701 TGTAAGATCG AATCTTCCGA TTTCCTTGTT ATCTTTGGCC ATGGGACGCT

20751 CTCCTTGGAG AACTACGATG GTAACCGCAG CTGGTTATCA GCAGCTGTGG 

20801 AGAAGATTTG TTTTTTCTGT GTAGGGATTG TAGTATTTCT CTCTACCAGA 

20851 GTCGTCATGA CGCCTCCTAG AGTTTCGATA CCCAGAGATA GGGGGATAAC 

20901 GTCTAGAAGT AGAACATCCT TAACTTCTCC GCCAAGAACA CCACCTTGAA

20951 TTGCGGCTCC AATAGCAACA ACTTCGTCGG GGTTGACTCC TTTATTAGGC

21001 TCTTTGCCGA AGAGTTCTTT TACAGTTTCT TGCACTGCGG GCATTCTTGA 

21051 CATACCTCCA ACTAAGAGAA CATCATCGAT ATCCTTAGCG GAAAGTTTTG 

21101 CGTCACTGAG TGCTTTGATG CATGGAGATT TTGTTCTTTC GATTAGAGAG 

21151 GCTGCGAGTT TCTCGAATTG CGCACGTGTG AGTGTCAATG CAAGGTGTTT 

21201 AGGTCCTTGT GCATCCATTG TGATGAATGG CTGATTGATT TCTGTGGAAG
21251 AGACTCCTGA AAGTTCTATT TTTGCTTTCT CAGCAGCATC TTTAAGTCTT 

21301 TGTAAGGCCA TATTATCTTT GCTAAGATCA ATGCCTTCTT GTTTTTTGAA

21351 TTCTTCGATC ATCCATTTGA TAATGACTTC ATCAAAGTCG TCTCCACCGA

21401 GGAGAGTATC TCCATTTGTA GATAGAACTT CGAAGACGCC ATCACCGATT
21451 TCTAGGATGG AGATATCAAA AGTTCCTCCA CCAAGGTCGA AGACAGCGAT
21501 TTTTTTATCA CCGACTTTAT CGATTCCGTA GGCAAGAGCT GCTGCGGTAG 
21551 GTTCTGGAAT GATACGTTTT ACATCTAGAC CTGCAATGCG TCCAGCATCT
21601 TTTGTGGATG CTCGTTGAGA ATCATTGAAG TATGCGGGGA CGGTGATCAC
21651 TGCTTCTGTG ACAGTTTCGC CTAGATAAGC ATCAGCTGTC TCTTTCATTT
21701 TCATTAAGAT TTGTGCGCCA ATTTCTTCTG GAGTGTATTG TTTGCCATCA
21751 ACTTCGAAAA CGGCATCACC TTTAGATCCG GAGGTGACTG TATAAGGAAC
21801 GGTTTGGATT TCCGAAGCTA CTTCAGAGTA CTTACGGCCA ATAAAGCGTT
21851 TTGTAGAGCC GAGAGTTTTT TCTGGATTTG TCACTGCTTG ACGTTTTGCT
21901 GGAATCCCCA CTAATTTCTC ATTACCTTTG AAGGCAACGA TCGATGGCGT
21951 GGTTCTTGTT CCTTCGGATG ATGTAATTAC TTTAGCTTGT CCTCCTTCCA
22001 TAACAGATAC GCAGGAGTTT GTTGTGCCTA AGTCTATACC TATAATTTTG
22051 CTTGATTTTT TGTGTTCACT CATGTTTGGT ACCTAATCTC TAGGGGTTAT
22101 TTCTATTCTT TATTTTCTTT GGGAGTAGGA GCTTTAGCGA CTTTAACTTT
22151 AGCTACCCGA ATCGGGCGTT CTCCTATTTT ATATCCCTTT GCAAACTCTT
22201 CTAAAATCGT CCCCTCAGGA ACTTCAGAAG TCTCTTCTGT TTGCACCGCT
22251 TCGTGTAGGA AGGGGTTAAA CTTTTGGCCT ATTGAAGAAT ATTCAATAAT
22301 ACCTTTTTCC TCGAAGATTT GTTTGAATTG GTTGAGAATC ATGTTGAATC
22351 CGAGGGCCCA ATTTTTTACA TCGTCGGACA TTTGTGTAGC AAATCCGAGG
22401 GCTTTCTCCA TGCTTTCTAT GGGATTGAGA AAGTCTATTA AAGTATTTTC
22451 TAAAGCATAC TGCATAAGTT CTTGGCGTTC TTTTTGTAAG CGTTTTCTAG
22501 AATTCTCAGA TTCTGCTAGA GCCATGAGAT ACTTATCGTT TTTTTCTTTT
22551 AATTCGGTTT TTAGGGTGAC GATTTCCTGT TGCAAATGTT CAACTTCATT
22601 TTCGTTTTGA ACATTGCTTT CGTGTTGTTC CTCATTTTCA GGTGGGGTAT
22651 CTGTCATAAC GTCTCCTTAG AGGGTAATAG TTTTATAGAA GAGTACTCCG
22701 TTCTTAAAAT AGGCTCATTC GAAAGCTTAC AGTTAGAGGT GAGTGGTCTT
22751 CTGAAGGATA GTTTAAATTT GTAGAAACTT TGTGTCAGGG TTTCATTTAT
22801 TTTATTCGCA AATAGTTTGA GCAAAGGAAG AGCTTCCTTA TAAGGAAGAT
22851 TGATCGGGCC TAGGATACCT AAAGCTCCGA GTGGAGAGCG ATTCATATAA
22901 TAGGGAATAG TAATTACAGA ACATCCTGGA TTCGAGGTCC CTAAAATATC
22951 AGAAAGCTCC TTCCCTATGA ACGCTGTAGC TCTTCCTTTA TGCATTCCTA
23001 TATTTAGAAG CTCACACATT TGTCTGCGAT TTTCAAAAAG AGAGAGTCCT
23051 AGAGCTAGAA CTTCAGGATC TTTAAACGCT TCGTATTTCA GTAGTTTCGA
23101 CATTCCTGTT TGATAGAGAT CTTCTTCACT AAAGTTGCAG TAGCGTGTTA
23151 GATAGCGGAC AACCACCTCA TTATAGAGGG ACATGCTCAG GTGTTCTTCT
23201 TTTTTCGAAA GTTCCTCATT TGTGGGGAGC TTTCGGATGT AGTTCTGCAG
23251 GAATTTTTCT ATACGTTTGA TAGAAAGAGT ATCGCAAGCT TCAGGCAGCC
23301 ATAGGGTGTC TGTGAAGATC TGACCAAACT CCGTAGAGAG GATGGTGACA
23351 GCTCTTTGCT TATCGACCTG TGTAATTTGA ATATTGGTTA CGGAATCATT
23401 TTCAAAGCGT GGGGAAGAAA AAAACGTAGG CAGGTCTAGG ATTTCTCCAA
23451 GAAGTTCCGT AGCTTTTTGT AGATCCTTGA TAATATTGCG ACTTTCGCTA
23501 GGAAGCTGAC TGATCTTATC AAAAATGGGG GCAGAAATCT CAGCTTCTGG
23551 GCATTCTTCT TGGTGATCTA CATAGTGACG TAATGCTAGG TCTGTAGGGA
23601 TTCTTCCTCC GGAAGTATGA TTTTTTTTTA AGAATCCTTC AGCTTCAAGT
23651 TCTGCAAAGT AATTTCTTAT AGTTGCCGTA CTCAAATCAG AGCAAAAACT
23701 TTCCTTTAAA GTTTTAGACC CTACAGGCTG CCCTGTTTTT AGGTACAACT
23751 CTGTTGTAGC AAACAGGATA TCAAGGATTT TTGAATCTCG CTTTGAGACT
23801 TTGGATCTAG CCATCTCGAG TCCTACTAGA ACAATCGTAA CTGAGAGCAT
23851 TATAGGAAAA GAACCCGAGA AGGTCAAGAG AATTTAGCAC TCGAATTGAA
23901 TGAATGCTAA CTTTTTTACG AGGAGGGCAG CGATCAAAGA GCTAGGCTAA
23951 GTGATTCTGA CACCAAGTAG GGAAGGCCTC CGGGGAGACT GTATACTTTT
24001 CTCCAGATCG GGATTCAATT TCGAATATTC CCGAAGATTG GTAGGACTTT
24051 CCTAAAATAA GCTTATAAGG AATGCCGATA AGGTCACTGT CTTTAAGTTT
24101 AAATCCGAQT CTTTCATCTC GATCATCAAG AAGGGGCTCA TAGCCTTGAC
24151 TTTGTAGCTC ATGATAAATA GTTTCCGCAA GCTCTTGAGA TACAGTGTCT
24201 CCTCCGTTAA AGGCGATAGT GATAGAGAAG GGAGCGAGTG CTTTTGGCCA
24251 AACAATACCA CGGTCGTCGG CAAGCTGTTC TACACAAGCG GCTAATGTTC
24301 TTCCGACTCC AATGCCGTAG GTCCCCATCC AGCACTGCTG GGTTTGCCCG
24351 TGTTCATCTT GGAAGTTTAC CTCAAAACTA TCGGTATAGC GTGTCCCGAG
24401 ATTGAAAATA TGAGCAACTT CTATGCCTTG ATAAATGCGG TAAGGATGGC
24451 CAGGATTTTC AGGACATGTG TCTCCCTCTT CAGCGAGTAG AAAGTCACCG
24501 TATTGGGGGG GGAGGAGGTC GCGATCCCAG TTTACATTTA CGTAGTGCTT
24551 ATCTTTAGCA TTGCCCGCAC AAACAAAGTT CGTCATTGGG GACGTTGTTT
24601 CGTCTGCGAA AAAGTCTATG GGACAGTTTA GGGGACCGAT GAATCCTTTT
24651 TCTGTGCCTA GAACGCGTTC GATTTCTTCA TCAGAAGCTA GAGCAATATC
24701 ATCGGCATTC AGTTTGGAAG CGACCTTCAC TAGGTTGACT TGCCGATCTC
24751 CTCTCATTCC AATGGCAATG AATTTTTCTT CATTTGAGTA GGAGAGTTTT
24801 ACGACAAGGG TTTTTAAAAT TTTATGTAAG GGGATAGAGA AGAAGTTTGC
24851 TAGAGCTTCT ATTGTTGTAA TCCCAGGGGT GGCCACTTCT TCGACGGGAA
24901 GAAACTCGCG ATCGTAGGCA TGCTGTGGAG GAATGGAGAC AGCAGCCTCA
24951 ATATTAGCTC CATAGGAACC GCTGACGCAG ATCGTGTCCT CGCCTAGAGA
25001 GCAAAGGACC TGAAATTCCT CAGACTTTCC TTTGCCGATT TTCCCTCCAT
25051 CAGCTGTAAC GATGACATAG GCAAGACCGA GACGATCAAA GATCTTACTA
25101 TACGCAGAGC GGAGTTTTTC ATATTGCTCG TTCATTTGTT CGGGAGAGTC
25151 TGAGAAGGTA TAGCTGTCTT CCATAAGGAG CTCTCGAGAG CGAATGAGAC
25201 CGAATCGAGG GCGAATCTCG TCTCGGAATT TTGTAGCAAT TTGGTAAAGG
25251 TGGAGAGGAA GTTGTCTTTT TGAGGAGAGC CATTGTGCAA CAAAAGAGCA
25301 GATGACCTCT TCATGTGTAG GAGCTAGGCA ATGAGATTTT CCTTCGCGGT
25351 CTTTGAGAGT GTAGAGCAGT CCTTCCGAAG TAAATGCCTC CCATCTCCCT
25401 GTATGTTGCC AAAGTTCAGC ATTGTGGAGA AGTGGGAGTA GAAGTTCTTG
25451 ACCTCCAATC GCATTAAGTT CCTCTCTAAT GATGTTCATC ATCTTGGAGA
25501 CCACGCGCCA TAACAGGGGT GTATAGGTAT AGACTCCTTT ACTTACTTTA
25551 AATAGGTATC CTGCCTTTTC TAGGAGCTCG TTTGAGAGCA CAGCAGCGCT
25601 TTTATTTGCA TTTTTTGAAG TCTTATAAAA GAGTTGAGAC GTTTTCATAG
25651 AGTGGGGCTG TTATCTTCAG TGATTGATCG TCAAGAGTCT ATAGAAAAAA
25701 TTACAGGCTA TTGCGATTTT ATATCGAGAG TCTTTTAAGC TAGTAGATTT
25751 AGTTTAGCTA TGAAGGAGGT ATCTTAACAC AGGGATTTTT CTAGATCAAT
25801 GTTTTGCGTT GGGTTCTTAA CTTAACAATT GGGGTCAGGA TTATGTATTA
25851 TTTTAATTAT ATTTTTGTTT TTATTTAGAA ATTTAATAAT CTCAACTTAT
25901 AATTAAGTGT ATACTTATTA ATCTTTTATT TTTGTAATTG TAGTACTATG
25951 TCGTCAGTAA ATCAAAGCTC TGGAACCCCG AATCCAGAAG AGGTAACTTC
26001 TCCTGAATCT ACGGAAGAAA ACAAAAATGT TGTTTCTTCA GATGAGGCGC
26051 AAGCCACGCA TGCTGTGGCT CTTCCTATAG TCACTCAACT TTCTCTTCCT
26101 GAAGGTGTGG GGACCTCATC TGAAGAAACG GCGAGTAATC CGAGGGTAGA
26151 CGAGATTGTA GCTGAAGTTT CTTCGAGTCG GGCGGTTGCT GATCAGATCT
26201 CATCACTTGT AGAGCGTGTT GGAGAGCTTT TAGACGACCT TAAGGGTGCC
26251 CAGTCCCTTT TCACTAGCTT TCAGTCAGAG TTGAAAAACT GTCTTCCGGC
26301 ATGGAAATCT TCAACGAGAA GACTCGAAAC TCGAGGTGCT GGGGATAATG
26351 CGGATATAGC GAGGCTGGAA TTATTTCGTA GCGATTACGA GGCTGTCTTA
26401 GGCCATGCGA ACCAGTTTCA TGGGAAGGCT CATCTCATTT TAAGTAAGTT
26451 AACAGATGTA CATCACAAGC TACAGGGACT CAGTCGTGAA GATCTTTCCC
26501 TGGCGTTTGA CAATAATGAT AGGGTTCTTG AGCATCTGGG TTCGTTAGGG
26551 CTTGATGTAG ACGCTGAAGG TAATTGGTCT CTTTCTTGTG AGAGGGGGAT
26601 TCCGCGACTG GTGCTTACTG CTGACAGTAT GCTTGTCCAG ATCAAGAAAG
26651 TGAATCTACC TACTGTAGAA GAATTGCGGA CTCTTCAGGG AACAACGGAA
26701 TCTTCGTCTG ATCCTAGGGT TGAAGAAAGT TTGTCTTGCT GTGAAAGATT
26751 GCTCAATGAA TTACGTCGTC TTTGGGCGAA TTTTGTAGGT TTTATTTCGA
26801 GTTGCTATGA CAACATCGTG TTTGTTTTGA TGTGGATAGT GAGACGGATT
26851 AACCTTTTGC CTGGGCTGGG GTGTTTGCCT TTCCATAATC CCGATGCTTC
26901 TCAAGAAGAC CAGAGGTCTT CTTCCGGAGA GCGTTCTACA AGGAGAGAAC
26951 GCCTTTCTCG GCGATCTGAC TTATCTGAGG AAGAGATGAT TGTGAGAGCT

27001 GAGGGAGAGT CTATACATCC TGAATCTCCC CATGGAGATG GCCGTAACCA

27051 ACCTAGTCGA GGTGATAAGC AAGACTCTGA TAGTGAGGAA GAGACGGAGT

27101 TATAATAAAG AATGATCTCT TCTGAGTAGA AACTATCGAA TCAGAATTGC 

27151 CAGCTTTTAC TCTCACCCAG TCCTATTTTT ATAGTTCAAA TTGCGGTGGT 

27201 TTGTTTATTG TTTTTTTGAA TTTTTGTTAT ATGGTGGCGT CTAATATTGC 

27251 CGATTAAGGA GGGCGCCCTT TATGAATAGA AGAAAAGCAA GATGGGTAGT 

27301 GGCATTGTTC GCAATGACGG CGCTCATTTC TGTTGGGTGT TGTCCTTGGT

27351 CACAAGCGAA ATCAAGATGT TCTATTGATA AGTATATTCC TGTAGTCAAT

27401 CGTTTACTAG AAGTTTGTGG ACTTCCTGAA GCTGAGAATG TTGAGGATTT

27451 AATCGAGTCC TCGTCTGCTT GGGTACTGAC TCCTGAAGAA CGTTTTTCTG

27501 GAGAGTTAGT CTCTATCTGT CAGGTTAAAG ATGAGCATGC TTTCTATAAC

27551 GATTTGTCTT TATTACATAT GACTCAGGCT GTGCCTTCGT ATTCTGCAAC

27601 GTATGATTGT GCTGTAGTTT TTGGCGGGCC TTTGCCAGCG CTACGTCAGC

27651 GCTTAGATTT TTTGGTGCGA GAGTGGCAGC GTGGCGTGCG CTTTAAGAAA

27701 ATCGTTTTTC TATGTGGAGA GCGAGGGCGC TATCAGTCTA TTGAAGAACA 

27751 AGAGCATTTC TTTGATTCTC GGTACAATCC TTTCCCTACT GAAGAGAACT

27801 GGGAATCTGG TAACCGAGTT ACTCCCTCTT CTGAAGAAGA GATTGCCAAA

27851 TTTGTTTGGA TGCAAATGCT TTTACCTAGA GCATGGCGAG ATAGTACTTC

27901 AGGAGTCAGA GTGACATTTC TTCTAGCAAA GCCAGAGGAA AATCGTGTGG 

27951 TTGCGAATCG TAAGGACACC TTACTTTTAT TCCGTTCTTA TCAAGAAGCG 

28001 TTTCCGGGAC GCGTGTTATT TGTAAGTAGT CAACCCTTTA TCGGTTTAGA 

28051 TGCTTGCAGG GTCGGGCAGT TTTTCAAAGG GGAAAGCTAT GATCTTGCTG 

28101 GACCTGGATT TGCTCAAGGA GTCTTGAAGT ATCATTGGGC TCCAAGGATT 

28151 TGTCTACATA CTTTAGCGGA ATGGTTAAAG GAAACGAACG GCTGCTTAAA 

28201 TATTTCAGAG GGTTGTTTTG GATGATTCAT GGATCTTAGA GGTTAAAGTC 

28251 ACTCCAAAAG CCAAAGAGAA CAAAATTGTA GGCTTTGATG GACAAGCTTT
28301 GAAGGTCCGT GTTACCGAAC CCCCAGAAAA GGGTAAGGCC AATGATGCTG
28351 TAATTTCTTT ATTAGCAAAA GCTTTATCCT TACCGAAGCG TGATGTCACT
28401 TTAATTGCAG GAGAAACTTC TCGAAAGAAA AAGTTTCTTC TTCCTAACAG
28451 AGTTCAAGAC ATTATTTTTT CTTTGCATAT AGACGTATAG CCTAACTCAC
28501 AGCTACGTTT TTCCCTTTCT CAGATTTTTC CAATTTTTTG CTTTTAAAAG
28551 ATAAGAACTG GTTGCGTTCT GTTTTATGAA GCTTGATTCC TAAGTACTTG
28601 ATGATATCTT CATTAAAGGT GGTTGTAGGT GACAGGCGTT GAGCAATGAT
28651 TTTACGCAGG CTGTCCACAT CGTGATTGTT ATAGAGTAGG TGGTGCACAA
28701 TTTTTGCGAT TTGTTTTCCT GATTTTTTGT AATCCACGCT ACAGGCAATG
28751 CAGGCTCCTT CGGAAATTAA GGAGGTATCG TCGGTAATGA TAGGGATTTT
28801 CTCTTTGAGG ATTTCCTGAA GGAATGCGGT GCCTTCTTTA TGAGAAAGTG
28851 GGGAGAGGGG AATGAAGATA GCTGAGGGGC GCTTGTCGAT AGCCTGGCGT
28901 ATCCGGGTTT TGAATGTACT GCTTGTAATA GAGATCTCAA TGACCTCAAT
28951 TCCTGAAGCA TGGAGTTTCT TAACAATTTC TTTTTGGAGA TCTGAGGGGA
29001 AAGGTTCGGA GGGTTTTAAA TACACGATAG ATTGTGCATT GGTAGCTACG
29051 GCTTGTATAG CAAAGCAGTA TTGATTGATG TCTAGAGTGT CATTCACTCC
29101 GTAGATATTC ATTGTGTTTT TAGGAAGGGT TAGGCTTTCG CGATCAGGAA
29151 CAGCGGCATA GATCACAGGT TTCTGTGTTT CAATGTGGCT CATGACCTTC
29201 GTAGCAATAG TTCCTAAGGT GACAATCGCC ACGACATTTT TATCGGTATG
29251 TAAGGAGCGA GCAATTTTCC TAGCCTTTAC GATACTGTCT TCAGCATTTA
29301 GGACAACAAT TTCAGGAAGG TTCTCAAAAT CTTTCAAGGT TTCTATACAG
29351 CTTTTACTGC AATCTTCTAA TAGGGGATGG GGAAAGGATA AGAAAATTGC
29401 GATTTTAGGA GAGGAGACGC TATCTGGTTG AGAACCACAA GTGGCTACAT
29451 AGATGAAAGA GCAAAACAGA GAAAAGAAAA ATAAGTACTG AGAGAGTTTA
29501 CGTAGAATTG TCATGCAAGA ACCATCGTTT CTTTTAGTGA AGCGGTACAG
29551 AGGCGGTCAC AGGCTAAAGC GATATTTTGT GGTTGTGTCA GAGCGGAGAA
29601 ACGAACAAAT CCTTGTCCAC AGGAACCAAA ACCGTGGCCG GGAGTCACTG
29651 CAATATGATA CTGATGTAAG AAGAAATCAA ACGCTTCTTC ATCAGAGATT

29701 CCTTCAGGGA GTTCTACCCA AAGGTAAGGG GCATGATCGC CACCATGAAC

29751 TGAGAATCCT GCAGTTTCTA AGCTTTTTTT AAGTTTCTGA GCATTGGTTA

29801 GATATAAAGA GATGGCGGGA GGTGTCGGAA ATAAATCTAG GCCGTAATAC 

29851 CCTGCTTCTT GCATGAGGAG AGATGCTCCG TTAAATGTAG TCGCAAAGAG 

29901 CCGTTTCCAA TCGTTGATCA TAGGTTCGTT ATTGTCATAG GTGAGTTCTT 

29951 TAGGGATCAC GTTCCAGGCA AGGCGCATGC CAGTAAAGCC TAATGATTTA 

30001 GAGAAAGAGT TGATTTCTAT AGCACAATAT TTTGCTTCAG GGATTTCGAA

30051 GATGCTTTTA GGTAGGCTAG GATCTGAGAC AAAGGCGCTA TAGGCCGCAT

30101 CAAAAATAAG AACGGTTCCG TGCTGATTCG CGTAGTTCAC AAGTGCTTGG

30151 AGTTGTTGAA AGGTTAGAAC TGTTCCTGTG GGGTTGTTAG GATAGCATAG

30201 ACAAAGAATG TCTAGGGATT GTTGGTTCGG AAGTTCTGGA ATAAACCCAG

30251 TTTCTTTTCT GCATGCTAGG GGGATAATGT CGCGGATTCC TGTAATGTGG 

30301 GCAATGTCTC TATAAGCTGG ATAGACAGGA TCCTGTAGAC CTAGAGTCTT 

30351 TTCTGAGCCA AAAAAAGAAA AGAGACGGAA GATATCAGGT TTGGCACCAT 

30401 CCGAAATAAA AATCTCTTCA GGGGAGATTC TATTTTCATA GACTTCAGAG 

30451 GCAATTTTTG TGCGTAATTT TTCTAATCCG GTTTCTGGGC CGTACCCACG 

30501 ATAGGTCTCT TGTTTCTCTT GAGAAACGCA GAACTCTTTG ATTGCCTGAG

30551 TAATAGAGCG GCAGAGAGGT TGTGTCGTAT CTCCGATAGA AAGATCTATG

30601 ACAGAGATTT CTGGATTCTC CTTGCGAAAC TGAGCAAGCT TTTTACTAAT

30651 TTCAGAAAAT AGATACTGAG GCTTGAGAAG AGAAAAGTGG GGATTTCTAC 

30701 GCATAGCGTG CTCCAGGTAA GGGCGTGATT CACTTGGAAG ACATCTTAAG

30751 AAAGCCCCAG CTTTTTGGAT AGCCATTTTC TGATTTTTCT TTAACCGCCT

30801 CTATAAAGGC AAGGAGTGCT CTGTAGTTTC GGTTAATTAT TAAGATATAT

30851 TTATAATTCT GUTTATTAAA AGTTTTTTAA ATCTTTTCTA ATTTGCTCAC

30901 TATAATTAAA GGATAAGATT TGAAAAAATT TTTTAGGTAA TTATGATAAG

30951 GGTCAATCCT TATGGAAGTT ATAGGGGTAG GAATCCTTCT CCAGAAGATG
31001 GGAAAAAGGA TGTACCCCTT TCAGGGAACT CTCGCTTGCA TCGTCGTGGT

31051 GGGATTCGTA GAAAGCATAA GAGTGCTTCA GTTGGGGTGA CCTCGGGTTC 

31101 TAAGACGGGG AAAGCTTCTT TAGAGAAGAA GGTCAAAGGC ATTTCAGAAG 

31151 CCCATTTCAA ATAATCCAAG ACAGAAGGTT CTCATTCGAA GACAAGCAAA 

31201 GGATTCGTTG GCAGATTTGT TCAATGGATT AGAACGTTTA CAGGACGTGG

31251 AAGCAAGAAG CGTTCTCCCT CAAGTTTTTC TCCAACGCAC CCTTACATAC 

31301 GTTTGCGAAC TTACACACGC AGTCCAAAAC AGAGTGGTGT AGAGAGAAAA
31351 CAAGAAGATG CTGAGACCTC ATTTATAGAG ACACCCAAAG GGATCTTGAA 
31401 AAAGCCTGGA AACAAAGACC CCAAAGGCAA GCACGTCCAT TGGAAAGACA 

31451 GCTAATCCGG ATCAGAGCTG AGATCCCTCC TTCTCTTCAT TAAGAGGGTT
31501 CAGAGTTTTA GGTCTCTGCG GTAGAAGGGA GGGAATCCAT AAGGAGAAGA 
31551 CCAGTGGATT TTCAACGGAG ATTCTAGAAG GAAACAAGCT GATTCTACAT 
31601 GGAGATGTTT GTAGGTCTGT TGTGAATTCC TCTCTTCACA GATCGGAAAG 
31651 ATCAATGAAA AGAGTTAGGG TCTCTTAGTT TTAAGAATTG CTATACAATA 
31701 TTGCGAAATA AATACTTTCT TCTCTATACA CTGTCTTTAC GATGAAGACA
31751 GCTTTTCACT CTTGCTATTC TTGGTTTTGT TGGCTCTTTA GCTTCTTGGT
31801 ACTCTTTGTG GGTGGCATCG CTGGGGGAGA GCCTTTGTGC CCCGATTGCA 
31851 AATACGAAAC TAAGTCTGTT TTACGTTCGG ATCAGCTGCC GGATCATCTC 
31901 TGGAACTATG AAAACGACTG TTATCTTACA GGTTATGTGC AGTCTCTTTT
31951 GGACATGCAT TTTTTAGATA GCCGTACGCA AGTTGTTATT GAGAAGAATA 
32001 GAGCGTATCT TTTCTCTTTG CCTGTAGATT CGAGTTTATC AGAAGCCATT 
32051 ACCAACTTTG TTAGGGATCT TCCCTTCATA TGTGCTGTGG AGATTTGCGA 
32101 GCGTCCTTAT GGTGAATGCA TAACGAGATC TTCTGCGGAG CGTCCCTTAC
32151 TCCCTAAAGA GAAAACTTTA GGAATGCCAA TTTTCTGCGG CAAAGAAGGG 

32201 GTATGGTTAC CTCAAAATAC CATTTTGTTT TCTCCTTTGA TTGCAGATCC
32251 TCGTCAGGTT ACCAACAGTG CTGGCAfTCG TTTTAATGAG AAGGTCGTGG 

32301 GGAATCGGGT AGGTGCTACC ATCTTTGGGG GAGATTTTAT TCTCCTGCGT
32351 CTTTTTGATG TTTCTCGATT CCATGTAGAT TGTGATTTCG GAATTCAAGG
32401 AGGAGTCTTC TCAGTTTTTG ATTTAGATCA TCCTGAATCG TGCATGGTAA
32451 ATTCAGATTT CTTTGTTGCC GGACTCTGGT CAGGGGCTAT AGATAAATGG
32501 AGTTTTAGGT TTCGATTGTG GCACCTCTCG TCCCATTTAG GAGATGAGTT
32551 TATTCTTACG CATCCAAATT TCCCAAGATT TAATTTGAGT GATGAGGGCG
32601 TCGATCTCTT CATTTCGTTT CGTTACACAC CACAGATCCG CTTGTATGGC
32651 GGCTGCGGTT ATATTGTAAG TAGGGATCTT ACTTTTCCTG AGCGGCCGTT
32701 TTACTGTGAA TGGGGTGCGG AACTCAGACC TTTTGGTCTG AGAGAAGGAA
32751 ATCTCCACGC ACAACCGATT TTCGCGATGC ATTTCCGTTG TTGGGAAGAA
32801 CAGAAATTTG GCTTGGATCA AAGCTATATT TTAGGCATGG AGTGGGCCAA
32851 ATTTCAAGAA ATCGGAAGGA AAATCCGTGC TGTTTTAGAA TATCATCAGG
32901 GATTTTCTAA AGAAGGCCAA TTCATTCGTG AACCGTGTAA TTACTACGGT
32951 TTCCGTCTTA CCTATGGATT CTAAACGATA ACAAAACCAT CTTCAGGGAC
33001 AGGGTGTTCA GGTCCTATGG GCAGCGTATG ATTGAGATAT CTGCGGAAGA
33051 TTGCAATGCC TGCCTTTGCT GAGGATAGGC AGAATAGGCA GTTGTGTACC
33101 CATTCAGAAC CTCGAATGGT TCCTACAGCA TGATTGGAAT TATACAAAGC
33151 TGTGATCTTG GATTTCCAAT AGTCTACAGG TCCGATTAGG AAGACGGGAA
33201 CAAGAGCTTT TTTCCCTGTT TTGAGACTAA TAAGCTCCAG AAGGAGTTCG
33251 AAATCGGTTC CCATGCCTCC GATAACAAAT ACAGCAAGGT CGACATGGAA
33301 GTCGGCCTGA CGTTCTAAAA GATCAGGAAT AGCATAGCTC ATTTTAGCTT
33351 CTACATAGGC ATTCGTGGTA TCCAAGCTAA TTAGATTCCC ACAAGAGAGT
33401 ATGGAGAGTT CTGTAGCTAC ACGATTCGCG AGTTCCATAG CTCCAGAACC
33451 CCCTCCTGTA AGGATTGCTA ACGGTGTCTG TGGTGGAAAT TCTGGGATCG
33501 TGAATTGCTG AGAAAGAGTA TGCATTCCTG TCAGGAGCTC ACGGAGAAAC
33551 TCATCATAAT CCCCAGCGAT TAGGCAGGAA CCATGAATTC CTATAAAGTA
33601 GGATTGAGCA AACTGTTCAG CTTGATGTTT AGGGACAAAC ATGCCCACAT
33651 CTTTATTTCT GCGTTTGATG TATTGTAAGA GTCGTTTCGA TTCTAAGTCT
33701 GCCCAAAATA CAGAAATTCC TGCAAAATAT AGATCGAGAA GGAAAGAGCG

33751 ATCTCGATTC GAGAAAAACT CTCCAGAAGT GGGAGAGGGA ATCTGAAAAT

33801 AGATATGTTG CAGGTAATAG CGAGAGTAGT TAGAGAGGAA CATGCCCTTC 

33851 AGCGATGCTG AAGGGAAGTA GCGGGAAAAT AAAACTCCTT GGCTTGTGAT
33901 ATGATCTGTT TCCATGGCTT TTAAAAAAGG GAAACAAGGT TGGTCTTCAA 
33951 TGTGCTTTTG AATTTCCCTA GCATGTCTTT CATCTGATGG GGAGATTCGA 
34001 GGTTTGATGA TCCAAGAGTC TTGGGAGAGC TCAAGCAGCT CACTACCTTT 

34051 GGAGATAAAC ATCGCAGCTT GATCTTCGCC TTCCGGTATG GATTCAAAAA
34101 CACGAAATAC CTCTTGAGGA GATTCTAAGG TTTCCTGGAG CATATCTCTA 
34151 TAGAAGAAAA ACGAATGCTC TTTGTAAGGC TCAAGAGTAA AAAATTCTAA 
34201 AGGTATTCTC TCAATAGGTT CTGAAGTGCT GCCGTAGAAT TCATAAATAT 
34251 CTCCAGATTC TTGTGTGGTA GGTTCGAGAA TATCCGCTGC GGTGTGACGA 
34301 AGCCCTTGGG GGAGTAAGTC CTGAACGACT CTTGCAAATA CGGTTCGGAT
34351 GTGCAGAGGC TCTGTCTTTA TGAGAAGAAT TTTATGATCT TCGGGAACGG
34401 GAGGACGATC TGTTACCATT TGATACAAAG GAAGAAACTT ACGTATTTTT 
34451 AAATGGGGAC GCGTGAGTGA TTTGCTCATT AAGGGAAGGA ACCCATAAAT 
34501 TGTCTCTTCG TAACAGATTG TTCCTGGAAG GATCGGAAGG AAGACAACAA 
34551 GCCGATCATT AATGATCTCT AGAGTGATGA AGTGCTCAAG TTTTTTCCCA 
34601 AAGCGTAGGA GCGGAGATCC TGTACGGTCT GTGTGCGTAA ACATCCTGTT
34651 GAGATAACAA GGCGAACGTA CGAGTCGGCG ATCATCAGCA GCAAAGAGCT 
34701 TGCAGACAAA ACTTCCAGGC TCTAGGAGCT CCAACATAGC AGTGGCTATA 
34751 GGATCTTGGC TCATGAAGAG AACGTGTAGA CGAGCTTCTT TTCGGGCTTT
34801 ATTTAGCTCC AAGTGGTTTA AAACGGCTTC GACACCTAGT TGGGCTAAGG 
34851 AACTTTTTAA ATTTACTTGT ATACACTGTT GAGGCAGATG AAATCCAAGA
34901 AAGTACGCAG GAATATTCTC AATGAGGACC TCTCCTTCGT AAATGTGGGG 
34951 CGAGAGTTTT TTCAAATGGG AAACGAGTCG TCCGTCTGGG GAGGCTGCAT 
35001 CATGATGCGC GTGGAGTAGG TTATACATAA TTCATCGTAT TTTAAAGTTA
35051 AAAGCAATAG GTAAGGTCCC TCTTTGCGCT TCTAAGACTA GCGTTTCCAG
35101 AAGAATTTCC TACAAGGACT TCATTCTGCT TGTTGTTATA ATTTTGTGCA
35151 ATTTGTCAAG GAGGGGGTCT TGGAACGAGA AAATTGTGTT GTGGAAGGGA
35201 ATTTTAGGAA AAAAGTTTCT TAGAAAATAG GGCAGAAACT TTCTCTTGGA
35251 GAGATTGAGT AAAATGTAAA GAATAGTCTT TGCAATTGAG AATTATTTCT
35301 CTCTGGTTAC AATGGAGGAT TGGCTAAGGA GGATAGTAGG TATGCAGATT
35351 CCAAGAAGCA TTGGTACTCA CGATGGTTCT TTCCATGCGG ATGAGGTCAC
35401 AGCGTGTGCT CTCCTTATTA TTTTCGATCT TGTGGATGAA AATAAAATTA
35451 TACGCTCTCG AGATCCTGTC GTATTATCGA AATGTGAATA TGTTTGTGAT
35501 GTCGGTGGTG TTTATTCTAT AGAAAACAAG CGTTTTGATC ATCATCAAGT
35551 CTCTTATGAT GGATCTTGGA GTAGTGCAGG TATGATTCTG CATTATCTTA
35601 AAGAGTTTGG TTATATGGAT TGTGAAGAAT ATCATTTCCT TAACAACACT
35651 TTGGTACATG GTGTGGATGA ACAAGATAAT GGCAGATTCT TCTCTAAGGA
35701 GGGATTTTGT TCGTTTTCTG ATATTATTAA AATTTATAAT CCTCGCGAGG
35751 AAGAAGAAAC TAATTCGGAT GCGGATTTTT CTTGTGCTTT GCATTTTACC
35801 ATCGACTTTT TGTGTCGGCT AAGGAAGAAG TTTCAGTATG ATCGAGTTTG
35851 TAGGGGGATT GTCAGAGAAG CCATGGAAAC CGAGGATATG TGTTTATATT
35901 TTGATCGTCC TTTAGCATGG CAAGAAAATT TCTTTTTTTT AGGGGGAGAG
35951 AAGCACCCTG CAGCTTTTGT TTGTTTTCCT TCCTGCGATC AATGGATTTT
36001 ACGAGGGATT CCTCCGAATT TAGATCGCCG TATGGACGTT CGTGTTCCTT
36051 TCCCTGAGAA TTGGGCAGGT TTGTTAGGTA AAGAGTTGTC CAAAGTATCA
36101 GGGATTCCTG GGGCTGTGTT CTGCCATAAA GGTCTTTTCC TTTCTGTATG
36151 GACAAATAGA GAAAGTTGCC AACGTGCTTT GCGGTTAACG TTACAAGATC
36201 GAGGGATCAT ATGACAGTAT TCAAACAAAT TATCGATGGA TTGATAGATT
36251 GTGAAAAGGT ATTTGAAAAC GAAAATTTCA TAGCTATAAA AGATCGTTTT
36301 CCTCAAGCTC CTGTTCATCT TCTTATCATT CCTAAAAAAC CTATACCACG
36351 ATTTCAGGAT ATCCCAGGGG ATGAGATGAT TTTAATGGCA GAGGCTGGAA



36401 AGATCGTGCA AGAGCTTGCT GCAGAATTTG GAATTGCCGA TGGGTATCGT
36451 GTGGTTATCA ACAACGGTGC TGAAGGAGGA CAGGCGGTAT TTCACTTACA
36501 TATTCATCTT TTAGGTGGGC GTCCTTTAGG TGCTATAGCC TAATTTTCTT
36551 TTGTTTCTGT GGATCCTTGT TCGGCTCGGA GTCCCTCCGT TATCAATTGT
36601 TGATCCAAGA TTTTGCAAAA GTTTCAGAAG AGGGCATAGG CCTTTTGGAG
36651 TCTAAAGAGT ATTCTTTACT TCAGGCTAAG CTAGTTTTAA GGGCTCTGGC
36701 TCAAAATTCT TCTTTTGATG ATTGGTTTAG AAGTTTTAAG AAGTGTCAGA
36751 TTTCCTATCC AGAGTTAGCT CATGATCGCG ATGTCTTAGA AGAATTTGGG
36801 ATTCAAGTTC TGCGTGAGGG AATCGAAAAT CCTTCCGTGA CCGTTCGTGC
36851 TGTGAGTGTC CTTGCTATTG GGCTTGCTAG AGATTTTCGC TTGGTCCCTC
36901 TCCTGCTCCA AAGTTGTAAT GATGACAGTG CTATTGTTCG ATCTTTGGCT
36951 CTTCAGGTTG CTGTGAACTA TGGCTCTGAA AGTTTAAAAA AGGCCATTGT
37001 AGAGCTTGCC CGTAATGATG ATTCTATTCA TGTTCGGATT ACAGCATATC
37051 AGGTGGTCGC TCTTTTACAG ATAGAGGAGC TATTGCCATT TTTAAGAGAG
37101 CGTGCTGAGA ACAAACTTGT AGATAGTGTA GAACGTCGAG AGGCGTGGAA
37151 GGCTTGCTTG GAACTCTCTT CTCAATTTCT AGAGACGGGT GTAGCTAAGG
37201 ACGATATTGA TCAAGCGTTG TTCACTTGTG AAGTGTTGCG TAACGGTATG
37251 TTGCCAGAGA CTACTGAGAT TTTTACAGAA CTCTTATCTG TAGAGCATCC
37301 TGAAGTGCAG GAGTCTCTCT TACTTTCTGC TTTAGCTTGG AGTCATCAGC
37351 TACAGAATCA CAAAGAGTTT CTTAGTAAAG TGCGCCATGT GATGTGCACT
37401 TCTCCATTTG CAAAAGTACG TTTTCAAGCT GCTGCACTTC TCCATCTGCA
37451 TGGAGACCCT TTGGGCAGAG ACTCTCTGGT TGAGGGCTTG CGCTCTCCTC
37501 AACCTCTTGT GTGTGAGGCA GCTTCGGCGG CTCTCTGCTC TTTAGGAATC
37551 CATGGAGTCC CTTTGGCAAA GGAGCATTTG GAGAGCCTTT CTTCTCGAAA
37601 GGCTGCTGCG AACCTCTCCA TTTTGCTTCT TGTGAGCCGT GAAGATATTG
37651 AAAGAGCTGG AGATGTGATT GCTCGCTACC TCTCCAATCC TGAAATGTGC
37701 TGGGCTATAG AGTATTTCTT ATGGGATGCA CAATGGAATT TACGTGGTGA
37751 TACCTTCCCT CTATATTCGG ATATGATTAA ACGTGAGATT GGTAGGAAGC

37801 TCATTCGCCT TTTGGCAGTA GCTCGCTATA GCCAAGCCAA GGCTGTAACA 

37851 GCAACGTTCC TTTCAGGACA GCAAGCTCAG GGATGGAGCT TTTTTTCTGG

37901 AATGTTCTGG GAAGAGGGAG ATGTGAAAAC TTCTGAGGAT TTGGTTACAG 

37951 ATGCTTGCTT TGCAGCAAAG TTGGAAGGAG CGTTAGCCTC GCTATGTCAG 

38001 AAAAAAGATC AAGCTTCCCT ACAGAGGGTC TCTCAACTTT ATAATGACAG 

38051 CCGTTGGCAA GATAAATTAG CAATCTTAGA GAGCGTTGCT TTTTCTGAGA 

38101 ATCTTGATGC TGTGCCTTTT CTTCTAGACT GCTGCCATCA CGAAGCTCCT 

38151 TCGCTGCGAA GTGCAGCAGC GGGTGCTCTT TTCTCTATTT TCAAATAAAT

38201 ATTAATAAAA TTATTCAAGA TATAGAAGAA AAACCACGCA GTTGTAATTT 

38251 CTATTTCTTA AAAATAATTT TTCAGACTGA CTTTATTCTT TCATTTTTAA

38301 GTCTTTGGAG ATAGAAAACT TTGTTATAGA TTTTTATCTG GTAGCTTTTA

38351 TAATTTATGA AGAGCGTAAG CTCAGAGCCT GTATGTCATG CACAACCTGT

38401 ATATAAAATC AGATTTGTTT TTTGAATCTC TATTCTCGTT AAGATTTCGT 

38451 TATTCTGGAC GTTATCTCCA TCACCACTCC TAATTTTCCT AGCATTTCTA

38501 TCTTTAAGCT CAGCACCGTA GCTTGCTTAA AGGAAATATT TTTCATTTAG 

38551 GTTGTGGAGT TCTTTATTTT ATGAATTTTT CATTATTTTT ATTTTTCCTG

38601 ATAGCTATTC AGGGAATCTG CTTGTACGTG GGACGTCGTG GTAGCAAAAA

38651 GGTAGAAGAT CGCGAGAGCT ATTTTCTTGC AGGAAGGAGT TTAAAAATCT 

38701 TTCCTTTGAT GATGACATTC ATTGCCACCC AAATCGGTGG CGGTGTACTT

38751 CTTGGGGCTG CTGAAGAGGC CTTCTGTTAT GGTTATGGGG GGATTCTTTA 

38801 TCCTTTAGGA GTCGCTTTAG GGTTGATTTT CTTAGGAATG GGGCCCGGGA

38851 AGCGGTTGGC AGAGGGATCG TTAACGACCG TAGTCTCTAT CTTTGAAGTG 

38901 TTTTATGGTT CTAAAAAGCT CCGTAAGATC GCATTTTTAT TATCCGCAGG 

38951 TTCCTTATTT TTCATCCTGG TCGCTCAGGT GATTGCTTTA GATCGGTTGT

39001 TTAGCAGCTT CCCTTTTGGC AAGTACGTAA CCGTAGCATT TTGGATTGTC

39051 TTAGCATCCT ATACCTCAAC AGGAGGGTTT CGCGGGGTCG TACGTACTGA
39101 TGTGATCCAA GCAGGATTTC TTCTTATTGC GGTGCTCGTC TGTGGTGTTT

39151 CTGTATGGCT CTCTGTCCCT AAATCCTTGT CTGTGTTGGA TCCTTTCCAA 

39201 TCACTTCCTT GTGCGAAGCT TTCCAATTGG ATATTCATGC CTATGCTCTT

39251 TATGCTTGTT GAGCAGGATA TGGTGCAAAG GTGTGTGGCT GCCTCCTCTC 

39301 CAAAACGCTT GCAATGGGCG GCTGTAGGCG CAGGCCTTGT TCTTCTTCTT 

39351 TTTAACTTTA TCCCTTTATT TTTAGGTTCT TTAGGAGCTA AAGCAGGCCT 

39401 TAAAGCAGGA TGCCCTCTGA TTGATACCAT TGCATATTTT TGCAATCCCT

39451 CACTAGCAGC TGTGATGGCT GCTGCCATCG GCGTTGCGAT TCTCTCTACC

39501 GCGGACTCTC TTATGAATGC TGTAAGCCAG CTAATCGCTG AAGAATACCC 

39551 TACGTTGAAA GCCCCTTATT ATCGTTATTT AGTATTGGGT TTGGCGGTTG

39601 CAGCTCCTCT TGTTGCTATT GGTTTTACAA ACATCGTAGA TGTCTTGATT

39651 TTAAGCTATA GCCTGTCAGT GTGTTGTCTT TCAGTCCCTG TGGGTTTCTA

39701 TCTTCTAGCT CCTAAAGGTC GCCGTGTGAG CGGAGCTGCT GCTTGGGCAG

39751 GAGTGCTCGT TGGTGCTCTG GGCTATGGAT GGGTTCAGAT AGTCTCTTTG 

39801 GGGATGTTTG GGGAGCTATT GGCTTGGGTA GGTTCTCTAG TCGCCTTTTC
39851 CTTTGTAGGA TTTATTGAGA TCACTTGGAA AAACAAAGTC AAAACGCAAA 
39901 CTTAGATAAC CACTGCATGA GAAGATATAA CTAAAATAGA TCCTGAGTTG 
39951 TTTAGGTTTC TCTTAGATCT GATAGGTTGC GCTTAGTAAG AGATCGTCAG 

40001 TTTTTTAAGT TGTGTTTAGA ATCTGATACC TCTCCTTCTT TTCCAAGAAG
40051 AAGAGGGGTT CGTTTTATTT TTTATTATTC ATTCGTAGGG GCGGGAAGCT
40101 GTTTTAACTG ATAGAGCAGG TCGATAAGAG AGGAGAGCAG GGCTTGATAG 
40151 AGCGTCTCTT CAGAATGGTG AAGAGTGCCC TTAAGAAGCT TTCTGCAATA
40201 CTTTAATAGC CAATACAGAG CGCAAAGTAA ACACTGTAAA AGGTAAAATT
40251 TACACGCTTG TTTTATCATA AACCTCCAAA GAAAGACGCG TGTTGGCGTT 
40301 CTTCCAATTG GCTGTTAGAT AGTGAAAATA GTATTTACTA TTCAATAAAA 
40351 ATATTTATAT TAAATATAAT AAAACAAAAT TTCTAATAAA CTTTTTAAAA
40401 GTATTTGGCT AAGTTTAATT TAAGAGTTCT CAAATAAAAG ATTTTTTAGT




40451 CTCTTTTATT TGAGAATCAT CAACCGATTT CAAGAAATCG CATGAGCCCG 

40501 TGATTCTGAT GGGCAGAAGT GCTTTTTAGC AAATAGGTGA AGCTCCTCGG 

40551 GCGTGATCTG TTGCTTCGCG TCACAAAGAG CCTTTTCAGG GTGTGCATGC 

40601 ACTTCGATCA TCAGACCGTC GGCACCTACC GAGAGACCAG CAGAGGCGAG

40651 AGGAAGAACT AGAGAACGCT TCCCCGCTGC GTGGGAAGGA TCTACAATTA

40701 CAGGGAGAGA AGAGATCTCT TTAAGGAGAG CCACGGTATT GAGATCTAGC

40751 GTGTAGCGCG TAGAGTGCTC AAAGGTACGA ATTCCTCGTT CACAAAGGAT 

40801 TACCCCAGGA CAGGAGGGAG AAGAAGCAAG GATGTACTCC GCTGCGCATA 

40851 GCCACTCTTC AAGAGTAGCT GCTGGACTGC GTTTTAGGAT AATCGGACGA 

40901 TGTGATTTGC TGACCTCTTG TAAAAGAGGG GTGTTATGCA TGTTTTTGGC

40951 TCCGATACGG AGGATATCCA CATGTTCGGC AGTAATTTCA ACATCTCGGA

41001 CATCTAAAAC TTCGGTTTCT GTAGGGAGAC CATGGATGCT CTGTGCTTCC

41051 TTATGCCAAA GCACACACTC TTTCTCCCAT CCTTGAAACG AAAATGGGCT

41101 TGTCCGTGGT TTTCTGATTG ATCCTCGGAA TACCTGAGCT CCTGCTTCTT 

41151 TAACTGTAAG AGCTGAAGAG ACTGTATGCT CGTAACTTTC TAAGGTGCAG 

41201 GGGCCTGCGA TCAGTATTGG CGATCCTTCT CCAAACGATA GATTTGGAGA

41251 AATAGGAACG GTATGGACCT CGTCAGGATG CTGTTTGAGG GTGCGCGGTA

41301 GGGGATAGGT AAACGTAAGA ATAAGTACCT CATGCAAAAC TAGGGATTTG

41351 AGCTGTCCGG TTTTGAGTTC TGATCTTCAG TTCTAGGGTT TCTAGGCAAG
41401 AGACAACCGT AGTGGTTGGG GAAGCGGATT AAGAAGATAC GAGCGTCTCT

41451 TTCCGGAGAA ATTGTATTTA AAAATCCATG CATCGCAACA AAGTTTTCAA
41501 TATTTTCTTC TGAGACAGTA TCTCGATCAG ACATGATCAG ACAAACAGCA 
41551 AACGGCTGCA AAGGTACAGT AGTCATAGGA GCGACGTCGT TAGCCTCAGC 

41601 TTGCTCTTTG ACGTCATCCA AGATATCACC GAGTTCATTA GAACTTAGGG
41651 AGTGGAGGTA GCCTATCAGT GATTGTCTAA AACCAGAGCG ACAGAAACCA 

41701 TCAGATGCTG TGACCGTGTT TTTTAAGAGA TTACAAATCA ACTGATCTCG
41751 AGCTCTAGAG TATTGATTTA TAGTCGCTGT AGCTAGAGGG AAACCGTATT
41801 TTAACGAAAG TAAAAGCTTA TCTTTTACTT GTGTAAATCT ATGAGAAGCC

41851 TCTCTTTGGA AAGCGTCTAA TCGAGCAGCT AGCGGAGTCT TTTTTTGTAG 

41901 AAGTTTAGGG TAGTTGTTAA GGAAACAAAA GATTAAAAAA GATTTCGAAA 

41951 ACTGATCACT TAAAGCTGTG GGAGAAGAAG GCGAACCCAG TTTTGATCCG 

42001 AAGAGATTTA ACCGTGTTAA AAAAACGTTC CACCCATTCC TAAAGGTACT

42051 TTCGATTTCT TTTTCTAAGA GTATGGCCTC GGTTTCCAGT TGATAATTTG 

42101 GGAAAGTAGC TCGTAGAAAC TGCATACAGT TCTTTTCAAG TATGGGATCT 

42151 AAAACCATCA CTTCAGGATG TCGTGACCAG AACTTGCAGA ACGTCTCTTT 

42201 ATCTTTATTA GAAGGGGTTG AAGGATAGCG TGCTGGAGGC TTGAGTACTA 

42251 GAGACCCTAG AATGGAACGT AGAAGTTTTT GGAGTTTCTT TTCGTTTTCT

42301 ACTTTTTTAA AGGCATATTG TAGAAAAGGA AGTAGAAGCA TTTTTAACTC 

42351 TTTAACGGCA TCTTTAGTTT CTGGACGATT ATTCACTTGA GGATGCTGTA 

42401 AAATAAAGAA TAAGAGCTGT ACTGCCTGTG TATAGAGGAG GTCCTGCTCT 

42451 TGTGAGAGAC CACTGCCTTC ATCACGCGTT AAACGTAAAC TTCTTGCCGA 

42501 GCAGACAGTA TTGAAGTGCT CTTTTAATTC AAGCTTACTA TGAATTGCGA 

42551 CTAATAGCTT GTTTACAAGA TTTGAAAGTA AATCCGGCTG ATTGCTTAAA 

42601 GGAGGGGTTT CAAAGGATGT GGATATTACA TTCTTTAATA GGTTTGCTAT

42651 GCGTTTATGT AGTGTTTCGA GTTGAGGCAT AGAGCTCTCT AACGTCAAAC

42701 TCTCATGAGT CGTAAGGATA GTCATGATGT TAGAAGAAAT CACCTCTCGT

42751 TGCCAAGAAG AATCTAAATC TTCATTAAGG TAGCTGAGGG AGGCGTTACT 

42801 TATTTGTAAG ATCTCTTGTA TAGAGCGTTG GGGGGGCGCG GACGCTATAA 

42851 TGATATCCTC TAATGATTTG TAGAAAGTAG CGCAGAGTTT GTGAAATCTT 

42901 ACAGAGGGGC AGGTACATAG AAAAGCAAAG AAAGAAAGAG CTGGAGACCA 

42951 AGGGAGATCC CCAGATCGTC TGTTTTCTAT AGTTGCAAGA AAGCGAGAGT 

43001 ATTGACATCG GATTGCTTCA GGGAGGTGAT CTTTGACTAT AGCATTGAAT 

43051 CTCTCAGTAT CCCACCCTGA ATCAGCAATG CGTCGACTCA GCTTAGCAAA

43101 AGCCTCTCGT ATTTCTCTTT CATATTGCTT TCGAAGTTGT TGATCTTCTG
43151 GAGACATCTT TTGAAGCTCT GCATCAGTAA GAAAGAGAGC GGGCAGATGT
43201 TCCAAGAGAA GAAAGAGTTT GTCTCTAGAT TGCATGTAAG GAAGTGCGGA
43251 ATGATCAATG AGATTTTGGA ACAATTCCGA ATAGGGACGA TTTGCAGCAA
43301 GAAATTCCAG AACACTGGGG AGGAGATCGT CCTGCATATC AGAAATAAAT
43351 AAGGCTCTCG CTGTTTCTTC ATTACTTGAA GCTGCGATTT GTTGGCGAAT
43401 CGCATATGCA GAGAGTTTTC TTAGAAAGGC TATCAGAGTT GCAGTGTGTT
43451 TCTGAGAAAG AATCACCCCG TCATAGAGGT CTATGAAGGA GCAATAAGTA
43501 CTGCATAACT GAAGGAGTTC AGCCATTTCT GCACAAAGAT TCGCATTTGC
43551 CGGAGAAGAG GAGGCGAAAG GCAGGTCAAG GAGACGTGTG GCTTCCTGCT
43601 CAAAGACAAT ATCATTTCTG CTGCTCTCTT CGTAGAGAGC AGATAGCCAT
43651 CCTACAGCAT AGGCACGATA AAAGCAGTTC CCATCTCCCG GTACATTCAC
43701 AAGGTAGTAA TTGTCATTTA GATAGAGAGC CTGTTCAAGA GAGAGTTGCG
43751 CAAGTCGCCG GTGTTGTTGA GGAAGATCCG GATTTTGTGC GATTTTCTTG
43801 AACTGTTTTA TTTGGAAATA CATCGGTTCG TTATCAATCC GATTGGGATA
43851 GGAAGCTACG AAATGAGGGT TAAAGTCGCC AAGTATTGGA TCAACTTGCA
43901 TGGCAGGAAG AGGCAGAGGC GCTCGTCTTA GTACCATGCT CACCATGCGA
43951 TTTAAGCATT GCCAATAGCC TTGACTAGAT GGGCGGAGAG GCATGGGGGT
44001 AGCCACCCGT ACGGGAGGAG GAGGGGCCTC TGGTGGTGGA GTCGGCTTTT
44051 TATCAGCAGG TTGTTTAGGG ACTTTTGGGC TCGCTGGTGA AGGAGCTTTG
44101 GGGGGAGGCG GGGGTGTGTC CTCTGGGGGC GGCGTGCCCG GCTTGGGAAC
44151 ATCGGGTTTT TTGTCTTCAC CATCCTTAGG CGGTTGTTTG GCAATTTCTA
44201 TAGTTTTTGG CTCTGGTCCT TTGGGAAGAG TGGGAGGCGT TGGCAAGCCT
44251 TCTTTTCTGA CAACCCGATG ATGCTTGTAG TAGTGAATCA GAAGAAGCAA
44301 ACCAAGAGTA ATGATATGGA GCAGAACGTA TCCTATGGTA CGTAGAATTC
44351 TAAGTAACAG AGGGTCTTTA GTATCAGTCG TTAAGTGGTA AAAATTATTC
44401 TTGTTATTGG GCGGGCAATG TGGTGGGGAA TTGGATACAT ACGTTCAAAA
44451 ATTGCTCGTT TTTTAATCAA AATTATTCAA AGTTAAAACT TTTTCGAGTT




44501 TGATGTATTT ATTTTTTTAT GTAAACTTTA ACACATGATA GAATTTAGCG

44551 TATAGAGCGC AAACTTTCAT GATAAAACAA ATAGGCCGTT TTTTTAGAGC 

44601 ATTTATTTTT ATAATGCCTT TATCTTTAAC AAGTTGTGAG TCTAAAATCG 

44651 ATCGAAATCG CATCTGGATT GTAGGTACGA ATGCTACATA TCCTCCTTTT

44701 GAGTATGTGG ATGCTCAGGG GGAAGTTGTA GGTTTCGATA TAGATTTGGC
44751 AAAGGCAATT AGTGAAAAAC TTGGCAAGCA ATTGGAAGTT AGAGAATTCG 
44801 CTTTCGATGC TTTAATTTTA AATTTAAAAA AACATCGTAT CGATGCAATT 

44851 TTAGCAGGAA TGTCCATTAC TCCTTCGCGT CAGAAGGAAA TCGCCCTGCT

44901 TCCCTATTAT GGCGATGAGG TTCAAGAGCT GATGGTGGTT TCTAAGCGGT
44951 CTTTAGAGAC CCCTGTGCTT CCCCTAACAC AGTATTCTTC TGTTGCTGTT 
45001 CAGACAGGAA CGTTTCAGGA GCATTATCTT TTATCTCAGC CCGGAATTTG 
45051 TGTCCGTTCT TTTGATAGCA CCTTGGAGGT GATTATGGAA GTTCGTTATG 
45101 GGAAATCTCC GGTTGCCGTT CTAGAACCCT CGGTAGGACG TGTCGTTCTT 
45151 AAAGACTTCC CTAATCTTGT TGCAACAAGA TTAGAGCTCC CTCCTGAATG
45201 TTGGGTGTTG GGCTGTGGTC TCGGCGTAGC TAAAGATCGT CCTGAAGAAA 
45251 TACAAACGAT TCAACAAGCG ATTACAGATT TAAAGAGCGA AGGGGTGATT 
45301 CAATCTTTAA CCAAGAAATG GCAACTTTCT GAAGTTGCTT ACGAATAGAG 
45351 GGTATTCTTA TGGCAACCTC TGTTCCTGTA ACTTCATCTA CTTCTGTAGG 
45401 AGAGGCTAAC TCCTCCAACG AAAGATTTAC TGAACGAACA TCGCGAATGT
45451 ATTACGCAGC TTTAGTCCTA GGGGCTTTGA GCTGTTTAAT TTTTATTGCT 
45501 ATGATTGTCA TTTTCCCACA GGTCGGATTG TGGGCTGTGG TCCTCGGGTT 
45551 TGCTCTTGGA TGTTTACTTT TAAGCTTAGC TATCGTTTTT GCTGTCTCCG 
45601 GTCTCGTTTT AGGCAAGACT TTAGAACCTA GTCGAGAAGC GACTCCTCCA 
45651 GAAATTGTTG CGCAAAAGGA GTGGACTACA CAACAAGATG TCTTAGGGAA
45701 TGAGTATTGG CGTTCCGAGT TGATTTCCTT GTTCTTACGA GGGGATCTCC
45751 ACGAATCTCT GATTGTTGAT TCTAAGGATC GATCTTTAGA TATTGATCAG
45801 AGTTTACAAA ATATATTGAA ACTTGAGCCC CTATCTACGA CACTTTCGCT
45851 GTTAAAGAAA GATTGTGTCC ACATCAATAT CATTTTACAT TTAGTGAGAC
45901 AGTGGAACTT ACTGGGAGTG GATCTTAGTC CTGAAGTCAC TGCGCACGCC
45951 GAGGAACTTC TACTCTTTTT GATAGAAGAG CAGTATTACT CTCCTGATAT
46001 TTTGAAATTG ATTCGCTACG GAGATGCTTT ACAAGCAACG TCTCCTTTGA
46051 TGGATTGGGC AGATTCAGGT TCCTTTAGTG TAGACGCAGA CGGGGTATTT
46101 AGCTGTCGCA GAGAAGAATG TTCTCCTGAG GATGCTTTGG CGCAATTCGA
46151 TCTTCTTTTG GCGTTGGAAA ATCCCGACAG ACGCTTCTTA AAGGATTCTT
46201 TTCTTACCTA CATTTGGTCG TCTTCATTTT TTGAGAAGTT TTTACATCGC
46251 CATCTAGAGA GCTTGCAAAG AAAGCTCCCA GAGACAGCGA TCGATGTCGC
46301 CCGCTATGAA GCACAAATAC AAACATTTCT CTCTCGCTAT TTTCAGAAGC
46351 TCGATTTGAT AAACGCAATG TCCTTAGATT GGGGATATAA CTGTGCTGAG
46401 GGAGAAAAAT GTTATGAGAG CGCAAATCAA AGATTAGACA ACCTATTTAT
46451 TGCTTTTTCT TCTTCTGTTC CTGCTATGAA GCGGCTCTTT GACAAATATG
46501 GTTCTGTGGT ACGGGTAGAT CGTAGGCAGA TTCGTGAGCA GATTGTTTCG
46551 AACACTGAAA TCTTAGAAAA TGAGTCAGGG TTCCTCTGCA GTTTGTATGA
46601 ATATCCTTTA TCCTATTTGA TAGATTGGGC TGTTTTGCTA GACTGTGTTC
46651 GCGGTACCGA AATCTCTCTA GAAGATCAGG CCGATTACAC CGTTTGTTTG
46701 CAAGGCTTGG ATTCTATGTT ATCTCAATTT GCGAGTCGTT TACAGTCTGG
46751 ACAAAAAGTA TTGAATCCTA GAGATGTTTT AAGTGAACAG GCTGCGGTTA
46801 TGCTTGTTCA TGGCTTGGCA GCACAGGGCG TGTCGTTTCA AGGATTGAAA
46851 GCTTTGATGT ATTTGACAGC CGTTCCCCAA AGAATGTGGT TAGGAGCATT
46901 GCCTTTATTT GAATCTTTTC CTGTCTTTAA TCGGATGAAA GAATTTCTTG
46951 GGGAATCTCT GGGAGACTAG GTGAATTTGT ATCAAAGAAG GAACAAGATT
47001 GCATGTTAGG TTCTTTGCCA TGTTATCCTG GTGCTGGCAA TATTGAAGAA
47051 TACAAAAATA GGTATTTCTA TTGTCAGTTA TGTGCTGAGG TCGTTAGTCC
47101 CTATGTTGTT CCTGTTATTG TAGTTGATGT GCAAGGGGCT CCTCCTACAG
47151 GTATCTTGCA GGTCTTGCGT TGTAAGCAAC ATAAATTTCA AGGCCTACCC



47201 GTACATGGCC CCATTACTTC TTTATGGGCT TTGGAGCCCG TGGGTAAGGG 

47251 AGCTCCGCAG CTGGAGTCTG CAATGTACGA GCTCTGTTCT CAAGTAAGGA 

47301 ATTTTGACAT CTGCTCTATT GTGAGTTGGG TCTTTGGTGG GTTGTGTATT

47351 TTTGCAGGTC TGATTGTCGG GGTAATGGTT GAAGCCCCTT TGATTGCGGG

47401 ATTAAGTGCT TGGGTGATTC CCTGTATCAT TGGAGGGGTT GGTGCCATTT 

47451 TATGCTTGTT TGCGATCTTG ATGGCGTACT TGGGAAGAGG GAGAGTCCGT 

47501 GAGTGGCTCA ATCTTTCACA CGAATATATA ACGCAATGTC ATTGTCGTCA

47551 GATACAGGCA CATTCTCAAA ACTATTCTGT GATCACAGAG TATCCTGCAA

47601 CCTGTGCATT ATCTCAACCG ATTACAAAGT TACCTAATGG ATCACGCAGA 

47651 GATAACTAAG CGTGTTCGTC AGTTATTTCT CACATTTTCT CATGAATCTT

47701 TTACTGCGCT GCACGAGATC CCTCTCGAAA ATTTTTAAGG ATAGATACTT

47751 GGAAACTATG GTTTAAAAAG CTATAGAGGA TTCTAAATTG GGGTTCTAGC 

47801 AACTTCTTGA CTTTAAGATC CAAAGTTAAG AGACTGACTA ATTATTTTTG 

47851 TTTGCTTGTG TTTCCAGATG AGCAATTGGT ATGGTAAGAG ATATTCAGAG 

47901 TGAATCTATA GGGAAATTAG TATTTTTAGG CACAGGAAAT CCCGAAGGAA 

47951 TTCCCGTGCC GTTTTGCTCA TGTAGAGTGT GTCAAAACAC AGGGATTCAT 

48001 CGTTTACGAT CTTCGGTACT CATTCAATAT CAAAACAAGA CTCTAGTGAT 

48051 TGACGCAGGC CCTGATTTTC GTACGCAGAT GTTAGTTGCA GGGGTTTCCG

48101 AGCTCGATGG GGTATTTCTG ACCCATCCCC ACTACGATCA TATCGGTGGT 

48151 ATTGATGATT TACGTGCGTG GTACATAGTC ACGCAGCGTT CGTTGCCTTT

48201 GGTCCTTTCT GCAAGCACCT ATAGATTTTT AAACAAGGCT AAAGAGTATC 

48251 TCTTCGCCAC TCCGAATGTA GAGTCTTCAC TTCCCGCAGT TTTAGAGTTT 

48301 ACAATCTTGA ATGAGGACTG TGGGCAGGAG GAATTTCAGG GCATTCCCTA 

48351 TACTTATGTT TCCTATTATC AAAAGTCGTG CCATGTAACG GGTTTTCGTT 

48401 TTGGAAATCT TGCTTATCTT ACAGATCTCT GTAGCTATGA TGCAAAAATT 

48451 TTCAGTTACT TAGATAATGT AGAGACATTG ATCTTGTCTG CGGGTCCATC

48501 GGAAACTCCT ATTCCTTTTC AGGGACACAA ATCTTCGCAT CTTACTGTAG
48551 AAGAAGCCAA AGCTTTTGCG AATCATGCAG GGATAAAGAA TTTAATTATT

48601 ACACATATCA GCCACTGTTT AGAAGCAGAG CGTGACCAGC ATCCAGAGGT
48651 CACATTTGCT TATGATGGCA TGGAGGTCCT TTGGACACTA TAGATACGCC
48701 CGGGGAACAG GGTTCTCAAT CTTTCGGAAA TTCGTTAGGG GCCAGGTTCG
48751 ACTTGCCTCG TAAGGAACAG GATCCCTCTC AAGCTTTAGC TGTGGCTTCC 
48801 TATCAAAATA AGACAGATTC TCAGGTCGTT GAAGAACATT TAGACGAGTT 
48851 GATCTCACTT GCGGATTCCT GTGGTATTTC TGTTTTAGAG ACCCGTTCTT 
48901 GGATTTTAAA AACACCCTCA GCTTCCACCT ATATCAATGT GGGGAAGTTG 
48951 GAGGAGATCG AAGAAATCTT GAAAGAGTTT CCCTCTATAG GGACTTTGAT 
49001 CATAGATGAG GAGATCACTC CATCCCAACA ACGGAATTTA GAGAAACGCC

49051 TTGGCCTTGT CGTTTTGGAT AGGACGGAGT TAATTTTGGA AATCTTTTCC
49101 AGCCGTGCCC TTACTGCAGA GGCAAATATC CAAGTCCAAC TTGCACAAGC
49151 ACGTTATCTC CTTCCTCGTC TTAAGAGACT TTGGGGGCAC CTATCTCGGC
49201 AAAAATCTGG GGGAGGTAGC GGAGGCTTTG TTAAGGGGGA AGGAGAAAAA 
49251 CAGATCGAGC TAGACCGTAG AATGGTCCGT GAGCGTATCC ATAAGCTGTC 
49301 AGCACAGCTG AAAGCTGTGA TCAAACAGCG TGCGGAACGC CGTAAAGTAA
49351 AATCTCGACG AGGAATTCCT ACCTTTGCTT TGATAGGGTA TACAAATTCA 
49401 GGGAAGAGCA CCCTATTAAA TTTGCTGACG GCTGCTGATA CGTATGTTGA 
49451 AGACAAGCTA TTTGCAACTT TAGATCCCAA AACGCGCAAA TGCGTACTTC 
49501 CAGGAGGCCG TCATGTCCTT CTTACTGATA CTGTAGGCTT CATTCGAAAA 
49551 CTTCCTCATA CTTTGGTAGC AGCATTTAAA AGTACTTTAG AAGCAGCTTT 
49601 CCATGAAGAT GTTCTTCTGC ATGTTGTCGA TGCTTCGCAT CCTTTAGCTT
49651 TAGAGCATGT ACAGACGACC TACGATCTGT TTCAAGAGTT GAAGATTGAA 
49701 AAGCCTAGGA TCATTACTGT GTTGAATAAG GTAGATCGGC TTCCTCAAGG 
49751 AAGTATCCCT ATGAAATTAC GTTTGCTCTC TCCTCTTCCT GTATTGATTT
49801 CAGCAAAAAC TGGGGAGGGG ATCCAGAATC TTCTTAGTCT TATGACGGAA
49851 ATCATTCAGG AGAAAAGTTT GCATGTGACT TTGAATTTTC CTTATACAGA
49901 ATATGGAAAA TTTACGGAAC TTTGCGATGC CGGGGTTGTG GCCTCGTCAA

49951 GGTATCAAGA AGATTTTTTA GTTGTTGAAG CGTATCTTCC TAAGGAGCTG 

50001 CAAAAGAAAT TTCGTCCTTT TATTTCTTAT GTTTTCCCTG AAGATTGTGG 

50051 AGATGACGAG GGTAGAGGGC CCGTCTTGGA GAGTTCTTTC GGGGATTAGG 

50101 TAGTTTTCTT CTAGGACATC GAATCTTTGT TAGTGAGAAA AAGAGTGATA 

50151 TTTTAAAATA GCCACTCATC GCTAAATCTA TTGAAGTCTC TAGAGGTATA 

50201 TGACGGTTGC GGAAGTCAAA GGAACATTTA AGCTGGTCTG TTTAGGCTGT

50251 CGGGTGAATC AGTATGAGGT CCAAGCATAT CGCGACCAGT TGACTATCTT

50301 AGGTTACCAA GAGGTCCTGG ATTCTGAAAT CCCTGCAGAT TTATGCATAA

50351 TCAATACGTG TGCTGTCACA GCTTCTGCTG AGAGTTCGGG TCGTCATGCT 

50401 GTGCGTCAGT TATGTCGTCA GAACCCTACA GCACATATTG TTGTCACAGG 

50451 TTGTTTGGGG GAATCTGACA AAGAGTTTTT TGCTTCTTTG GATCGGCAAT 

50501 GCACACTTGT TTCCAATAAA GAAAAATCCC GACTTATAGA AAAAATTTTT 

50551 TCCTATGATA CGACCTTCCC TGAGTTCAAG ATCCATAGTT TTGAGGGAAA 

50601 GTCTCGAGCT TTTATTAAAG TTCAAGATGG CTGTAATTCT TTTTGCTCGT

50651 ACTGCATTAT TCCTTATTTG CGGGGGCGTT CGGTTTCTCG TCCTGCTGAG 

50701 AAGATTTTAG CTGAAATCGC AGGGGTTGTA GACCAAGGAT ATCGCGAAGT 

50751 TGTAATTGCA GGAATTAATG TTGGAGATTA TTGCGATGGA GAGCGTTCAT 

50801 TAGCCTCTTT GATTGAACAG GTGGACCGGA TTCCTGGAAT TGAGAGGATT
50851 CGAATTTCCT CTATAGATCC TGATGATATC ACTGAAGATC TGCACCGTGC 
50901 CATCACCTCA TCGCGTCACA CTTGTCCTTC GTCACACCTT GTTCTTCAAT 

50951 CGGGGTCGAA TTCAATTTTA AAGAGAATGA ACCGGAAGTA TTCTCGCGGA
51001 GATTTTTTAG ATTGTGTAGA GAAGTTCCGT GCTTCTGATC CTCGCTATGC 
51051 CTTTACTACA GATGTGATTG TCGGATTTCC TGGAGAGAGT GATCAAGATT 
51101 TTGAAGATAC TTTGAGAATT ATTGAAGATG TAGGCTTTAT TAAAGTGCAT 
51151 AGTTTCCCTT TCAGTGCTCG TCGTCGTACT AAGGCATATA CTTTTGATAA 
51201 TCAGATTCCC AATCAGGTGA TCTATGAGAG GAAGAAGTAT CTTGCTGAGG
51251 TTGCTAAGAG GGTAGGCCAG AAAGAGATGA TGAAGCGTTT AGGAGAGACT

51301 ACAGAGGTGC TTGTTGAGAA AGTAACGGGG CAGGTTGCTA CGGGTCACTC

51351 TCCTTATTTT GAAAAGGTTT CTTTCCCTGT TGTAGGAACG GTAGCTATCA 

51401 ACACTCTAGT TTCTGTGCGT CTTGATAGGG TAGAGGAAGA AGGGCTGATT 

51451 GGGGAGATTG TATGATAGAT ATAATGCAAC ATTTTAAGCC CTATACTATG 

51501 GTCCCAGGAC AAAAACTCCC TATTCCTGGA TCTTTGTTAT ATGCTCAGGT

51551 ATTTCCTACC CTGTGGCGTC TATTTTCTTC GAAACACGAA ATCTTAAATG 

51601 AGCAGACCTT ACAGGTGCAA GGGCCTTTAA AACGCTTTGC TGTTTTCCAA 

51651 GATTTACATC GTGGGGGGCT TGCAGTGACT TCTGAGCGCT ACAAGTATTA 

51701 TCTCCTTCCC TCGGGAGAGT GCACACAATC TATCAAAGGG AAACTGCCTT 

51751 CGGCAGCGCA AGCAGGGCCC CTGTTATCTC TTGGGGTGCA TAAGCATGCA

51801 GATTGGCAAA AGGTCCGTTG TCGTCGTGAT CTTAAAGAAA TTCTTCCCCT
51851 ATGGTTCCGT TTCGCCGCTA TGGCTCCTAA GGGATCCTAT CGGGATCTAG 
51901 AGACGACGGC TATCGGTAGC TTGGTAAAGA CTGCCCATCA AAGAGTTTTA 
51951 CATAGGGAAA CTACAGAGAT TGCTCCTGCG TTACTCTCCA TAGCCCTTGC 
52001 GGGATTTTCA GAGTGCTTTC TTCCTAGGAG CTATGATGAA GAGTTCCAAG
52051 GAATCCTCCC CCAAGATGGA GATCCAGAGG GGGGAGTTCC TTTTGAGCTT
52101 CTCTCGTATA GCTTTGGTAT GATCCAAGAT ATTTTTCTGA GACACCAGGG 
52151 ACAGCTAGTA GAGATCCTTC CTGCATTACC TCCTGAATTT CCTTGTGGCC 
52201 GCTTGATTCA TGTTGCCCTT CCTAATCTTG GGACTTTGTC TATCGTCTGG 
52251 ACTAAGAAAA CTATCCGTCA GGTCGAGCTC CATGCAGAAT ATAGTGGCGA 
52301 GGTATTTTTA AAGTTTTGTT CTTCACTATG CAGTGCGCGC CTTCGGGAAT 
52351 GGTCGGAGCG ACGTCTCTCT GGATCTAAGA GACTTTCTTT AGGAGAAACT 
52401 CTGGAGATAA AAGCAGGAAC CACATATTTA TGGGATTGTT TTCATAAATA 
52451 GATAGCCTTC CATGGTTGAT AAACTGATCC ATCCTTGGGA TCTTGATCTG 
52501 CTCGTCTCAG GACGACAGAA AGATCCCCAT AAACTCTTAG GGATCCTTGC 
52551 TTCTGAAGAT TCTTCAGATC ATATTGTTAT TTTTCGTCCA GGGGCGCATA
52601 CGGTTGCTAT TGAACTTCTA GGAGAGCTTC ACCACGCTGT AGCTTATCGT

52651 TCGGGGCTCT TTTTCTTATC CGTTCCCAAA GGAATCGGAC ACGGGGATTA

52701 CCGTGTGTAT CATCAGAATG GACTTCTCGC TCATGATCCC TATGCGTTTC

52751 CTCCTCTGTG GGGAGAAATT GATTCTTTTT TATTCCATAG AGGAACGCAT 

52801 TACCGCATTT ATGAACGCAT GGGGGCAATC CCTATGGAAG TTCAAGGAAT 

52851 CTCAGGGGTG CTCTTTGTTC TTTGGGCTCC CCATGCGCAG AGAGTCTCTG 

52901 TAGTCGGAGA TTTTAATTTT TGGCATGGCC TTGTCAATCC TCTACGTAAA 

52951 ATTTCCGATC AGGGGATCTG GGAGCTTTTC GTCCCAGGCT TGGGAGAGGG 

53001 AATACGGTAT AAGTGGGAAA TCGTTACCCA ATCGGGGAAT GTGATTGTAA

53051 AAACAGATCC TTATGGGAAG AGCTTTGATC CTCCACCCCA GGGTACAGCT

53101 CGTGTTGCGG ATTCTGAGAG CTACTCTTGG AGTGATCATC GTTGGATGGA 

53151 GAGGCGCTCG AAGCAGAGTG AAGGGCCCGT CACGATCTAT GAAGTGCACT 

53201 TAGGCTCTTG GCAATGGCAG GAGGGAAGGC CCTTAAGCTA CAGCGAAATG
53251 GCGCATCGCC TTGCTAGCTA TTGCAAGGAA ATGCACTACA CTCATGTGGA 

53301 GCTTCTTCCC ATTACGGAGC ATCCCCTGAA TGAATCTTGG GGCTATCAAG
53351 TGACGGGATA TTATGCTCCA ACATCAAGAT ACGGGACTCT CCAGGAGTTT
53401 CAGTATTTTG TAGACTATCT ACATAAAGAA AATATTGGTA TTATTTTAGA 
53451 TTGGGTGCCG GGACATTTTC CCGTAGATGC GTTTGCTCTT GCCTCTTTTG 
53501 ATGGGGAGCC TCTCTACGAG TACACGGGGC ATAGTCAGGC TCTTCATCCC
53551 CACTGGAATA CGTTTACCTT TGACTACAGT CGTCATGAAG TGACCAACTT
53601 TTTACTAGGG AGTGCTTTAT TTTGGCTCGA TAAGATGCAT ATTGATGGCT
53651 TACGTGTGGA TGCTGTGGCC TCTATGCTGT ATCGTGATTA TGGCCGTGAA
53701 GATGGAGAAT GGACGCCTAA CATCTATGGA GGTAAGGAGA ACTTAGAGTC
53751 TATAGAATTT TTGAAACACT TAAATTCTGT AATTCATAAG GAGTTCTCTG
53801 GAGTGCTCAC CTTTGCAGAG GAATCCACAG CGTTTCCAGG AGTCACTAAG
53851 GACGTAGATC AGGGAGGTCT GGGGTTTGAT TACAAATGGA ACTTAGGTTG
53901 GATGCACGAT ACCTTTCATT ACTTTATGAA GGATCCCATG TATCGTAAAT
53951 ACCATCAGAA AGATCTGACA TTTAGCCTTT GGTATGCCTT CCAAGAGTCT

54001 TTTATTCTTC CTCTCTCGCA TGACGAGGTG GTCCACGGTA AGGGCAGCTT
54051 AGTGAATAAG CTTCCCGGGG ATACCTGGAC CCGATTTGCT CAAATGAGAG 
54101 TGCTCTTGAG CTACCAGATC TGTTTGCCTG GGAAAAAGTT ACTGTTCATG 
54151 GGTGGGGAAT TCGGACAATA CGGCGAGTGG TCTCCTGATC GTCCCTTAGA 
54201 TTGGGAGCTT TTGAATCATC ACTACCACAA AACTTTGCGA AACTGTGTCT

54251 CTGCATTGAA TGCGTTGTAT ATTCACCAAC CCTATTTATG GATGCAAGAG

54301 AGCTCTCAAG AGTGCTTCCA TTGGGTAGAC TTCCATGATA TAGAAAACAA

54351 TGTCATTGCC TATTATAGAT TTGCAGGCAG CAATCGTTCT TCGGCGCTTC
54401 TCTGTGTCCA TCATTTCAGT GCGAGTACTT TTCCTTCCTA TGTTTTAAGG 

54451 TGTGAAGGTG TAAAGCATTG TGAACTCCTT CTCAACACTG ATGATGAGTC
54501 TTTTGGAGGC TCAGGGAAGG GAAATCGGGC TCCTGTGGTC TGTCAAGACC
54551 AAGGGGTCGC TTGGGGTTTG GATATAGAGC TCCCTCCTTT AGCTACTGTG
54601 ATCTATTTAG TTACTTTTTT CTAAAAATTT AAATACTTTA TTTGTAAATT
54651 GTTGTGGGAT TGTTCTATTT TGTGGTGTAG TTGATATTAA TAATTTATTT
54701 TATAATTAAA AATAATTATT AGTATTTCTT TTATGTCTAC ATCACCAATT 
54751 AGCAACGATC CCCGATATTT GTCTTTGTCT AATGCAACTG AGAAAACTTC 
54801 TCTTCTTGCA AATAGCCGGA GTCTCTCGCC AGTACCAAAT TCCCTAGTTC 
54851 CTAGCAATCC TGAAGATACA GGATTGCGAA AAAGTATTTT CACCCATTCC 
54901 GTGACTTTAT TTGCTGGCCT GGTTGTTTTG CTGGTAGCGG TTTCTGTTGT 
54951 TGTTGTCGCT TTGACCGTCT TAGCTCCCGG AGTTCCTCAG GCTATTCTTC 
55001 TTGGAATCGC CATTTCAGGC GTGGGTATTG GTGGATTTTC TATAATGAAG
55051 AGCTTGGTTT ATATGGTCCG AGACTATATG TCCCCCAGGA TGCAGGAGTC 
55101 GAGCAGAATC AAAAGTGCTT TAGCTGTAGG GACTGGATTT ACTGTCATGG 
55151 GTTTGGTCAT GXAGGTGGGG GCGAATTTTG TTCCTGGAGG GTATGGGGGT 
55201 CTCGTGGGTA GCTTGGGATC CAGTGCGTAT TCCCGGGGAA GCCAAACCAC 
55251 ATTAGCAAGC TTCAGTCATT ATATTTATAC TAAGTTTTTC CGTTCTGAAA
55301 AAGTTGCTAA AGGGGAGAAG CTTACAGAAG CAGAAACTAT AAAAGAGGCG

55351 AAAAAATTAC ACTATATCAC GTTGTCAATT GCCACTATTG GCGTTGGTCT

55401 TGCGGTTTTG GGGATTCTCC TTGCCATTGC AGGAACGGTA TTGCTAGGAG 

55451 GCGCTCCCGC AACGATTGCT ATTATTTTAG CTCCCCCTTT AATTTCTATA 

55501 GGGCTTACGA CGGTTTTGCA AACGATACTC CATAGTAGTA TCGGAAAGTG 

55551 GAGAGCCTTT CTGCTTACTC AAGAAAAAAA AGATCTTTTT GTAGACACCT 

55601 CCCTGAAAGA CATTCGCTTA GAAAAATTGC CCCCCAGTGA GGTGGAAGAG 

55651 AGTGAAACTT CCCAATCTGT GATAGAAGTT CCAGATTCAG AGGGGATTGC 

55701 AGAGACGAGG ATCTCTGCGG AAGAAATCGA TACGAGGCTT TCCCTGACGA 

55751 CAAGACAGAA GGTCATCTTT GCTCTTGCGA CACTCTTGCT CTTAGCAAGT 

55801 ATTGCTGCCT TCATAGTCAC GGGATTTGGT GGATTGACAG TCATGCAAGT

55851 TCTCCTTGTT GCTTCTGTAG GATCGGCGGT TGCTTCTGTA ACACTCCCTA 

55901 TGGTTTCCTC AGGATTTTCC TACGTCGCCT ACCAACTGAA AGCAAGATTG 

55951 AATATCAGTA AATTACGTTG GAAAGAAGCA AAAAATAAAA AGCGGGTGCG 

56001 CCAGTTCTTA ATTGAGTCTG GAGTGATTGC CTCGGATCGA GAATTTAACC 

56051 AAATGTGGAA GACAGTCTAC AAAAAACAGA TTCAGAAGAC TGACGCTGCA 

56101 ATTCGTGAAG AGGTTCGCAA TTTTGAGAAG GGTGGGGAAG TGAACAGCGC

56151 CCTTGTTGGT GGAATCTTAC TTGGTGTAGG AACTGGGATC ATGCTTCTTG

56201 CCCTGGTCCC TGCATTTGCT CCTATCGTTC CTGGTATTCT TGCTCTTGGA

56251 GGATCGACGT TAGGAATCGC GGGATCGATT TTAATGAGGA AGTTTGTCAA 

56301 CTGGCTCTAT GATGAGCTTG TGAAGCTCTA TGAGCGTCGA CGTAATCGCC 

56351 GTGAGCTTCT CTATGGTCCT GAAAGTAAAA TGCGCTCCAT TGCTACGGAT 

56401 TTAGTTGTTG AGGCTCTTGC TGCTAGCCAC GATCATCTAT TTGATCTTGA 

56451 TGGTCCCGTA GATTTTATTG ATGTGGATGT AGATATAGAT GGAGCTGCTT 

56501 AGGCCAGGTC CTTGAATGTA AGATCCTCGA GCTTTGGGAG CTTGTCTGCC

56551 TTCTCTAATT TTATTTTCTC TTTTACATCT AGTAGTTTAC TTAATTATAT 

56601 AATCAGCATT CTTTTTGTTT ATTTTAATTT ATATTTTGTT TTTAAAATAT
56651 TTTTTATTTT AACATTTGTT TAATAGTTTT TATTAAATAA TTTATCATTT
56701 TAAGGTTCAA TTATGGCAGT TGGTGGCGTA GGCGGCTCAA GATCTCCTTC
56751 CCCCATTCCT CCTAATAGAA GGAATAGTGA GGATGGAAAA GTAAGTCCTA
56801 AAGACAACTT AGGGGAACAT ACAGTTAGCA GTAGTGACAG TAGTCTTGCA
56851 AGTCAGGGCC CTACAATAGA AGAGAGAAAA GCCCAGTTAG GCGGGACTGA
56901 TAAAATTCCT TTGCCATCTG TCAAAGAACC CGGAGATTCT CAAACTTCAG
56951 GACGTTCTGG GGTACTTCAG AGAATTTGGA AAGGCGTTAA AGGGGTCTTT
57001 AAAAAAACCC CTCAAGCGCG TCCTGAAGTT TCTAGTCCAC GTCTTCCATC
57051 CCATGTGCAA CATGGCCAAC GTCTTCCTGG ACTCGAGGGC TTTAGAGATC
57101 GTATCCAGAA AAGATCTGAA AATCCAGAGG CAGATTTAGG GAAGATGAAA
57151 CGTTCCTATT CTGATGGTGA CCTTGATCGA GTAGGACACG ATTCTAATGA
57201 AGATTCTACA GAGGATAGCC GTTCTGAAGG AGGAGAGCCT TCTTCAAAGA
57251 GTTCTTCCTT CTTATCAGGA GTTCGAGGAG CGGTGTCTAA AGTTCATGGT
57301 GCCCTAGGTG ATATTAAAGG AAAGTTCCAG CGTTCTGCTT CCGAAGATGA
57351 TTTAACAACT CAGGGCGAAG ATTCTGCCGG CGATACTGTA AAAGAAAGGC
57401 GTTCCGAAGA AGCAGAGGCT TCTTCGAAGA GTTCTTCTTT TTTATCAGGA
57451 GTTCGAGGAG CGACGTCTAC AGTTCAGGGA GCCTTAGGTG ACGCTAAAGA
57501 GAAGGTTTCG GCGTTCGGAG AGCAGGCTGC AGGTGCAATC AGATCAGCAC
57551 CAGGGAATAT CAGAACTAGA TTCCAACGTT CTTCATCGGA AGGTGATCTT
57601 TCTAATGTGA ATAAAGCAGC AAAACATCTG CGTAAGGCTT TAGAAAATTT
57651 GGAAAAAGTA GCTCCAGAAC AAGTGTCACC AGAGGTGGCT TCTAGGGTGC
57701 AATCTCTTCT TGCACGCATG GAGCAATTGA CTCATCAGGA ACCTCCTACT
57751 GTGGAGGATC TTATTACTTT CGTAGAATCC AATGTAGGTA GTGATTCTGT
57801 GGAGTATGCA TCCATCGTAC CTCAAGATGG ATCGCAAGCC CCAGCAGAGA
57851 CTGCGGAAGC TCCCGAAACA GGTGGGGTAG AGGGATCTGC AGCGCAGGGA
57901 GCATGGAAAG CGTTACGGGA TTTTGTAGTT AGCATATTCC AAGCGGTAGC
57951 GAGCTTCTTT AGGGCAATTG CTTCAAGATT AAGTTCAGCA CGACGTGAAT
58001 CAGCTGTAGA TGATCTTGCA TCAGAAAGTA ATACACAATG GTTTGTGGAG
58051 CAAGAGGGCG TTTCAAATCC ATCGGCTGCA CCTAGCTTAT CTTTTGCGGA
58101 AGAGATCGCT CGTAGAGCTG CAGAAATGAG TAACAGAAAT GCCCAGAGTC
58151 TTGAAAAATT GGAATCAGGC AATGTGACTG ATCCTGTCAT TCAACAAGGC
58201 TTAGGATTAG CTAGATCATT TGCTCCAGAG GGACAGTAGT CGTTATCTCA
58251 CTGTTTCTCT ATGCGCAAGG GAAACTTGAA GAGTTTTAAT TAAAACTCTT
58301 CAATATGTTG ATTATTTTAA TATATTTAAA AGCATTTTTG TTGTTTTTTA
58351 ATAAAATTAA ATTGTTTCAG AAAAAAGATT ATTCTTTTTA GGAAGTGTTT
58401 ATGGCATCAG GAATCGGAGG ATCTAGTGGA TTAGGAAAGA TTCCACCTAA
58451 AGATAATGGG GATAGAAGTC GATCGCCCTC TCCTAAGGGA GAACTTGGCA
58501 GCCACGAGAT TTCCCTGCCT CCTCAAGAAC ATGGAGAGGA AGGAGCTTCA
58551 GGATCTTCGC ATATACATAG CAGTTCCTCT TTTCTACCAG AAGATCAGGA
58601 GTCTCAGAGC TCTTCTTCGG CAGCTTCTAG CCCGGGATTT TTTTCTCGCG
58651 TACGTTCTGG GGTAGACAGG GCCTTAAAAT CATTTGGCAA CTTTTTTTCC
58701 GCAGAGTCTA CGAGTCAAGC GCGTGAAACG CGACAAGCTT TTGTTAGATT
58751 ATCAAAAACC ATCACCGCGG ATGAGAGACG GGATGTCGAT TCATCAAGTG
58801 CTGCTGCTAC AGAAGCCCGA GTGGCAGAGG ACGCGAGTGT TTCAGGCGAA
58851 AATCCTTCTC AGGGGGTTCC AGAAACCTCT TCTGGACCAG AACCTCAGCG
58901 TTTATTTTCT CTTCCTTCAG TAAAAAAACA GAGCGGTTTG GGTCGGTTGG
58951 TACAGACAGT TCGCGATCGC ATAGTACTTC CTAGTGGGGC TCCACCTACA
59001 GACAGCGAGC CTTTAAGTCT CTACGAGCTA AACCTCCGTT TGAGTAGTTT
59051 ACGTCAGGAG CTCTCTGACA TACAAAGTAA TGATCAGTTG ACTCCAGAGG
59101 AAAAAGCAGA AGCCACAGTT ACCATACAAC AGCTGATCCA AATTACAGAA
59151 TTCCAATGCG GCTATATGGA GGCAACACAA TCTTCGGTAT CTCTAGCAGA
59201 AGCTCGTTTT AAGGGGGTAG AAACTAGTGA TGAGATCAAT TCCCTCTGTT
59251 CAGAACTGAC AGATCCTGAG CTTCAAGAAC TCATGAGTGA TGGAGACTCT
59301 CTTCAAAACC TATTAGATGA GACTGCCGAC GATTTAGAAG CTGCTTTGTC
59351 CCATACTCGA TTGAGTTTTT CTTTAGACGA TAATCCAACT CCGATAGACA
59401 ATAATCCAAC TCTGATTTCT CAAGAAGAGC CTATTTATGA GGAAATCGGA
59451 GGAGCTGCAG ATCCTCAAAG AACTCGGGAA AACTGGTCTA CAAGATTATG
59501 GAATCAGATT CGCGAGGCTC TGGTTTCTCT TTTAGGAATG ATTTTAAGCA
59551 TTCTAGGGTC CATCTTGCAC AGGTTGCGTA TTGCTCGTCA TGCAGCTGCT
59601 GAAGCAGTGG GTCGTTGTTG CACGTGCCGA GGAGAAGAGT GTACTTCTTC
59651 TGAAGAGGAC TCGATGTCGG TGGGGTCTCC TTCAGAAATT GATGAAACTG
59701 AAAGAACGGG CTCTCCGCAT GACGTTCCAC GCAGAAATGG AAGTCCACGT
59751 GAAGATTCTC CATTGATGAA TGCCTTAGTA GGATGGGCAC ATAAGCACGG
59801 TGCTAAAACC AAGGAGAGTT CAGAATCAAG TACCCCGGAA ATTTCGATTT
59851 CTGCTCCCAT AGTGAGAGGT TGGAGTCAAG ACAGTTCCGT CAGTTTTATT
59901 GTTATGGAAG ATGATCATAT TTTCTATGAT GTTCCTCGTA GAAAAGATGG
59951 AATCTATGAC GTTCCTAGTT CCCCTAGATG GAGTCCTGCG CGAGAGTTGG
60001 AAGAGGATGT TTTTGGAGAT TATGAAGTTC CTATAACCTC TGCTGAACCA
60051 TCTAAAGACA AGAACATCTA CATGACACCT AGATTAGCAA CTCCTGCTAT
60101 CTATGATCTT CCTTCACGTC CAGGATCGTC TGGAAGCTCA CGTTCTCCGT
60151 CTTCAGATCG CGTACGAAGC AGCTCACCAA ATAGACGGGG TGTGCCTCTT
60201 CCTCCAGTTC CTTCACCTGC TATGAGTGAG GAGGGGAGCA TTTATGAGGA
60251 TATGAGCGGT GCTTCAGGTG CAGGTGAAAG TGATTATGAA GATATGAGCC
60301 GTTCCCCCTC TCCTAGAGGC GACTTGGATG AACCCATATA TGCTAATACT
60351 CCTGAAGATA ATCCATTTAC TCAGAGAAAT ATAGATAGAA TTTTACAGGA
60401 GAGGTCAGGC GGTGCTTCCG CTTCTCCTGT AGAGCCTATT TATGATGAGA
60451 TCCCATGGAT TCATGGCAGG CCCCCTGCTA CACTTCCAAG ACCCGAGAAT
60501 ACATTGACTA ATGTTTCGCT TAGAGTGAGC CCAGGGTTTG GACCAGAAGT
60551 AAGAGCCGCT TTGCTTAGCG AGAGCGTGAG TGCTGTTATG GTCGAAGCAG
60601 AGAGTATTGT TCCTCCAACA GAGCCGGGGG ACGGAGAATC AGAATATCTA
60651 GAGCCCTTAG GGGGACTTGT AGCTACAACG AAAATCTTAC TACAAAAAGG
60701 ATGGCCTCGT GGAGAGTCGA ATGCTTAGGA TTTAAGTAGT TCTTTCGAAT

60751 CTCTAGTGAG GATGTATCGG GTTCTTAATT TTTATGGGGG AAACGTATCT

60801 GTGGATTTCC CTTAGCTTCT CCCATAAGAT TCATGATGGT AGAGAGTGTT 

60851 CACTACTCTC TATTTGGTCT TTACAGGTTG CATTGTCTAT ATAACATGCT 

60901 TTTAAAACTT AAAGGTCGTT CCTGCGTGAA GGTAATGTGT CGTCGTTGAT

60951 GAGGATACCG AGCCTTGATA GTCTAAGAAC ACCGAAAGTT TAGGGAAGAT 

61001 AAAAATTTGG TTTCTTCCTT TAAAAGCAAT GGCATTGCGA GCAAGGGTGG 

61051 TTCCTGATAG GAGCCATGAC GATCCACTAG ATTCTAGACT CACGTTGATC 

61101 TCAGGATTTT GTTGGTAGAG GACAGGCTGA TAAGCAAGCT CTATGTTCCA 

61151 ATAGGTAGGA AGACGGAACT TGGATTCCCA AGCGCTCTGA ATTCCCAGAG 

61201 GGACTGTCAG GTTATATAAG GGTTTATGAA CAGAAAATTT TCTAGCTTTA

61251 TCTCCACTTT CTTGAAACGC AGTTTGATTA GAACGAACGG CAATTGCTTG

61301 GATAAAAGGA GTGAAGTGGA GAGGTCGTGA TCGCCATTGT AGAGATAGAG

61351 AGCAAGAGAG AGCCGCCCCT AATGTCGTAC TATAACATTT GCCTTCCGTT

61401 TGTATTTTTC CAGAATATCC AGATGCTTTG ATATGGTGGT TGCTGTAGCT

61451 GTAGGCTAGA GATGCAGATG TAGAGAATCT CTCTTGCAGC CAAGGATTAT

61501 TGATCTGGAG CGCTACAGTT GTCGTATGCG AAGCCACGGA ATTGTCGGAG
61551 TGGCTCTCGT AGAGATTACT GAAAAGTTGG GAGAAGTTTA CACCAAAGCT 
61601 ATGATTAGAA GCAGTGTTTG AGGTTGTTCC CAAAGAATAA CCCGTAGCTT 
61651 CCATATGGAA TCCTTTCGCA TCATTGTTGC TATTTTGATG CACGAAGAGT 
61701 CGAGTAGCTT CTCCAGAAGC TGTAGGTGCT ATTTGGCCTT GCTGTGTTTG
61751 ATAACGTAGT GTCGCAAATA AGTTATGGAA AGATTGCCAG AAGGCAGATA
61801 GGGCAATGTC TCCTTTGTTT TCTGGGTTTA CCTTATATCC TGTAGGTGTC 
61851 CAATCACCAT AAAGCTGGCG ATGTAAAGTA TTCACAGTAT CTTCAGAAGA 
61901 GGTATCAGAA GTTGTGATTG TTTCGATCCA GTAAGGGGAC CAAACGCCTT 
61951 GGTAGCCGTA GTGTTGAGTT GTATTTAGAC CCTCAGGGTA GAAATTATCC 
62001 GTATTAATAT GTTTAGCTGT GACGTCTAAG AGATACAGAA GAGGAACTTC
62051 TGCGATAGGT TGGGCAAGGT CTGCAGTATC ATAGGGATCT AGGTTCTCGT

62101 CATCCAGTAG GCTCAAAGGT CCTGAGAGAT TGATTATAGG GTTATTATCT 

62151 TCGCTATAGG GTGCTGATGA ACCTGTGGGG CGAATCCATA GCTTGGGAGC 

62201 AACTCTGTTG CCTAAGATAG AGGGAAGGTT AATTGCAAGA TTATTGATGT 

62251 TAATTACAGA ACCCACACTA CTGCTACTTT GTTCTTCGTC TGTTGTAGAA 

62301 AACACAGCTC TACTGCCTAA CCGTAGAGTC CCACCAAATT GATCAAATTT
62351 ATAGACTTTC CACTCTGCTC GATCTTCAAG AGCGAGTGTG CCGTTGTACA 

62401 GTCCAATGTG GTTTCTGAAA TGTGAAATGA AGTCATCACG AGAAGTCGAT
62451 GTATCCGGAA TATATGTTGA GGAGAACAAG ATAGTTCCGA GGTGTTCTGG
62501 ATTAGGATTA AATTTTTGGA TAGAGTTTTG TATAGTATAT CTTTGTAGTA
62551 TGGGATCATA GAAGGTAGCA GAATGACCTT GACTTGCTCC AACTGTTAAT
62601 GAGACATTAC GCGTGCAGTT TACAGAAACA TGATTGCTGA AAGTATCTTT
62651 GAAGTGTCTA TTATTATAAA AAATAATATC TCCCTGATCA GCAAATAAAG
62701 TGCATGCACC ATCTTGACGG AGCATGATAG CGCCGCCCCA AGTTCCCTGA
62751 TTGTTTGTGA AATAGACGGG ACCACTGTCT TGTATAGTTA GAGATTGTGT

62801 ACAGATAGCA CCTCCATCTC GTGCTGCAGT ATTATTATCG AAGGCTGCAA
62851 TTCCTGGGTT GTCTTTTATA GAACAACTAA TGCAATAGAT AGCCCCTCCA 
62901 GAGGAATGGT TAGCAGAGAT GTCCGCTTCC ATGGCAAAAT TATTGTTGAA 
62951 GATCACAGAA CCGGTATTCT TTGTAAGAAT GCACTCTTGA TGTACTCTTA 

63001 TTGCACCACC CAGACCTGAT TGGTTATTCA AAAAATAGAT AGGCTGAGAA
63051 TTATTCTCAA TTCTACAAGC ATTAGCGAAC AACGCGCCCC CCGCTGTTCC
63101 GCCTGCAGCA TTATTAAAAA ACAGGCAAGG GCCAGTGTTG TCCTTAATGT 
63151 TTATGATTGC AGCTTGGATT GCTCCTCCTG AAGATTTTGC CTTGTTGTTA 

63201 ATGAAGTATG CGGTTCCTTG ATTTTTTGAG ATTGTAACAT TTTTCGAACA
63251 TAAAACAGCT CCCCCTGTAC AAGTATCAGC GAAATTACTT GCATTAGGAA 

63301 AGCTTAAATT CCCAGAGAAA ATGATGGAAC CATGATTCTC AGAAAGATCG
63351 AAATTACCAT TCACATACAT CGCACCAGCT CTTTTAATAG CAAAGCTATT
63401 TAGGAAAAGA ATTTGGTTTT TTGTATTCGT TATGGCAAGT GATTTGCAAG

63451 AGAGAGCACC GCCGTCTTGA GAGAAGTTTT CGAACCAGCT TTCTATGGAA

63501 TTCTGGTGAT CGAGGACAAT GTCTTGGTTA GTGTCATCCC TAACTCCAAA

63551 AAGTGTTGCT CTATGAGAGT AGGGAGTCAT GTTAGTAAGA GTATCAATTA

63601 GAGGGAAGAG TGTTGTGAGT TGATTTGCTT GATTATCAAA ATAGTCAGAC

63651 AACGGAGTCG CATTAAGGAG TATTGTAGTT TTACCTAAAA TTAAGGCTCC

63701 AACAAAGAAG GAAGATTTAC TAAGGGATCT GTTATTTTGC ACTGTTTATT

63751 CCTTGATGTT TTCTTTGTGG TTAAAATGTG CAATGACTCT CAGCTTTAAG 

63801 GTATTGACCT ACAGTTGACG AGGAGACTTG AGCTGAATAA TCTAAGGATA

63851 AGGTTACTCT TGAGAAAAGT TGGGAAGTAT TTTTTATTTT TGCAGCTACG
63901 GAATTATAGG AGACGGGAGT TGCTTGTGTT GTCCATGTTC CATTGCTGAT
63951 GAGTAGTGTC GTGAACATTT CTGGATTTTT TCTGTATAGG GTAGGTACGT

64001 AGGATATTTC CGTAGTCCAT AGCATGGGGA TATGATGTGA AGTTTTCCAT
64051 TCAGAACGGA AGCCTATGGG AGAGGAAAGA TCTGTAAGGG GATGTTTTGG 
64101 ATGGAATTTT CTTATATGGT CTCCAGTTTC TTGGAACGAG GCCTGGGAAC 
64151 AGCGCAGAGC AATGGCACTG ATAAAGGGCT GGAGTTCGAG AGTGCGGGTG 
64201 ATTCTAGCTG GTAAGAATGT GCAGTCTAGA GAGGCTACCA AAGTGTGGTT

64251 ATTAAAGAAG GCTTTGGACG ACCCTTTTAA GATTTCTGTA TAGTGGCAAA

64301 GCATATGGTG ATCTCCGTAG CTATAACCTA GGGATAGCCC TGTAGAGATG

64351 AAGTCCCTGA AGAGGAGACT GTCGAAGCGG AGTCCTGCAA AGTAGTTGTG
64401 GGAGGAAGTC GTACTTGGAG ATTGACGTTC TCTAGTTTTG GAGAACATTT 

64451 GTGCGAATCC TAAAGAGAAA CTATGTCGTG CTGCAGTTTT TGCTGAGGTT
64501 GTTGCTGCAT AGCCCGTAGT ATGGTTTCGG AAGCCTTTGC GTCCCTCGCG 
64551 ATTATGTTGG TTAATTAGAA GCCCGAGTCC TTGCAGAGAG GCTTCAAGGT 
64601 CATGCTCTTT GAGGTTTTGT GGAGGTAAGA TGCGGATTCC TAACAGAGCG 
64651 TTATAGGCAG ACTGCCATAA GGTATTAGCA ATAAATTCTC CGTGACGTTC 
64701 CGGGTTAGGG CGGTATCCTA CAGGAGTCCA GTCTACGTAG AGCTGCCTGT
64751 GGTTTGTATT GGTCTGTTCC GGTACTGTAG AGCTTGTTGT AGTCGTAGTT
64801 TCCATCCAAT AGGGAGACCA GATTCCCTGA TATCCATAGT GCTCATCTAA
64851 GTTCATGGCT TCTACAATGA GATTCGAAGT ATCGATTTTT TTTGCAGTCA
64901 CATCGAGGAG GTAGAGGAGG GGGGATATCC TTTCGAGGTT CAGAGAGATC
64951 TAAGCTATCA TAGGGGTTTT CATTTTCATC GTTTAGAAAA GTCAAGGGTC
65001 CTGAGAGAGT GATAGTAGAA GAAGTGTCTT CAGAATAGGT GGATCCTGTT
65051 AATGTAGGAT AAATCCAGAA CTTTGGAGCT GAGGCTTCTG ATTGTAAAAT
65101 AGAAGGAAGA TTGATCGCGA TTGCATTAAA ATTTATGGAG CTTCCCGGGC
65151 CTTTCGTCCT GATTAATGCT GCGTTTCCTA AACGTAGAAT GCCCCCAGTT
65201 TGCGATAGGG TTTTGCAAGA AATAGCAGCC CGATCTTCAA TAGCGAGCAC
65251 ACCCCTTTCA AGTCGTGAAG AGTTAGAAAA TTTTGATAGG AAGTTCAATG
65301 GATTTGTTGC GTTAGAATCT ACATTGATTC CGGAAAACAA CACGGTGCCA
65351 AGGTGATGGG GTTCATAATT AAATACTATA GGATCTGTTG TCGTCTGATC
65401 GTGATCTATA GGATCATAAA AGAGAATTTT ATAACCCTGT CTTGCTCCTA
65451 GTTTTAAGTT AATCCCCGGA GCAGCATAGA GTGCATTTCT ATATCCGGGT
65501 TGAGGAGAAG AAGATGTGAT TGTATTATTG TTAAATAGAA TATCGCCGTA
65551 GTCTGCAGAG AGGAAGAAAT TTTGAGGAGT ACTTCCTATA CCAGAAAGAT
65601 TGATGAGAGC CCCTCCTGAA GTCGCAGAGT TATTAATAAA TGCTGTCGGA
65651 CCGTTATTTT GGAAGATGAA AGATCTCGTG TGTATAGCTC CGCCGCTAAG
65701 TGCTGCCGTT TTATTGTTAA AGATAAGACC TTTGGGATTG TTCTCAATGA
65751 CTAAGGAGGT ACACATGATA CCGCCACCAC CGGGATATAG TTTTCCTGAT
65801 GCTGTGTTAA TTGAGGATGC GGAGTGATTG CTGATCTCTA TAATTTCTTT
65851 ATTTGAAGAG ATAATGACTC CTAGGGCAGA AAATATACCA CCCCTTGTCC
65901 TGAAGCATTA TCATTGATTT GGATGTTTTG ATAATTCCTT TCTATATTTA
65951 CGGCTCTGCA GAAGATGCCT CCTCCAAAGC TGGAATCTTC TAATGTTTGA
66001 TTTTTCTTGA TAACGATAGG ACCTAAGTTG TCTGTGATTG ATAAACTCAC
66051 CCCAACATAG ATAGCTCCTC CTCTATTTTT AGAGACATTA TCTAAGAATT
66101 GTCCTTCTCC TAAGTTATTT GTAATATAGC TATCTAGTGC ATGTATAGCC
66151 CCACCAAATC CTGAAGTTGT AGTAGTAGCT AAGGAAGCCG CATTTGAAAT
66201 AAACGAGTAG TTCTGATTCT TAGAAATCCA GCATTCCCGA ACACTGTAGA
66251 GAGCCCCTCC AGAACTGTGG GAGCTATTCC TTTCAAAACT TAAGTTTCCT
66301 TTATTTTCAG AAATGAACAA ATTTTTACAT GCAAGAATGC CTCCATCCTC
66351 GTAGTGGTAG TTATAATCTA TAACAGAATT GATAGAGTTT CCTGTAATTG
66401 TGATGTCCTG GGTTATATCG TGACCGAATC CATTAAGAAG ATTAGAGGGA
66451 CGAAAGGAGG AAACCAAGGG CTCTGGAGTT ACATTCAGAC GGAAGCTTGG
66501 AGACGTGTGG AAAGCAGAGA TGTCTTTTCT AGACATCTGA CAAGAGGCGA
66551 GGTTAGGGAC TTCATTTCCT GATAAGGAAC AACATAGCGC AGTGGATAAA
66601 ATGCTGAGAC AAATGGGTCG CATAAACATC CAAACTTTGA TATGAAAATA
66651 GAGTTTGTCA TAAAAAATCA TCAATTTGTT GTCTAGAAGT TTATAAATTT
66701 ATATTTAAAT ATGCTTATTA ACATGTTGAA TATTACAGAG TTAGTAAACT
66751 CATATATTTG AATGATTTTC TATATTTAAA TTTTGAGGGA GTCCTCTACC
66801 GATTGGGCTA ACGTAGCTCA TCGGATAGTT GTTAATTCTA AAGATTTCAA
66851 TTTTATCGTT CTTTTATTTT AGAATTTTAA GGTACTTCCT GCTTGGAGAT
66901 GGTGCGTAGA TGTCGAGGAG GAGACCGATC CTTGGTAATC CAAGAATAGA
66951 TCGAGAGAAC GGAAGAGCGC AGTTTGATTG TGGACTTTGT ACCCTAAAGC
67001 ATTGCGAACA TAGTTATGGC CTAGGATATC CCAGGAACCT CCGCTCGCAA
67051 GTAGCGTGAC ACCGATTTGG GGATTTTGTT GATAGAGTAC CGGTTGGTAA
67101 GAAAGTTCTA GAGTCCATTC TGTAGGTACG TGGAATTTTG ACTGCCATTT
67151 TCCTTGGATT CCTAGAGGTA AGGTCAGATT ATAGAAAGGC TTTTGAGAGA
67201 CAAACTTTCG GGGATTGTCA CCAATCTCTT CGAACGCTGT TTGGTGAGAA
67251 CGTATTGCAA TTGCCTGAAC GAACGGGCTG AGGTGAAGAT AGGATTTCTG
67301 TTGCCAAGGG AAAGAACAGC CGATAGCTGC TGCTAATGTA TGGCTATAAC
67351 ACGTCCCTTC TGCCTGTTCT TGATGTGAGG GATGTAGGCT GTGGAGGTGA
67401 TGGTCCCCAT AGCCATACGC TAACACTGTG GATGTTGCAA AGGCCTCTTG




67451 GAACCACGGA AGCTCAACAT AAAGTGAAGA GACTGTATTG TGAGCCGAGA
67501 CGTTGTTGCT TGATCCGATT TCTTTAGTGC GGGTGAAGAA CTGTGCAAAA
67551 CCTAAGGAGA TTTTCTGATG TAAAGAAGTT TCGGAGGATG CTTGTAAGGA
67601 ATACCCTGTA GATTGGATAC GGAATCCTGG AGCCCCGGGG ATGCTATTTT
67651 GATGAACAAA GAGGCCGTCG GCAATCCCTT GAATTTCTAA GAAAGGCCTC
67701 TCGATATCAG AATCACCAGT TCGATTATAA CTTCTTAATA GAGAGAACAT
67751 AGTATGAAAG GATTGCCATA GGGGAGTCGT AGCAAGATCT CCTTGGTATT
67801 CAGGATTGAC CTTATATCCT AAGGGAGTCC AATTGGCATA CAGAGCTCTG
67851 TAGAGGGTGT TTGCCGTCTC TATAGAAGCG TTATTTGTTG TTGTTATCGT
67901 CTCTACCCAA TAAGGAGACC AGATGCCTTG ATAACCGTAA TGCTCAGTCG
67951 CATTTAAGCT TTCAGGATGA AAGTTATCGG TATTGATATG ACGTGCTGTT
68001 ACATCCGATA AAGAAAGAAG ATGAATGTTT TGTAAAGGCT CAGAGAGATC
68051 TATACTGTCG TAGGGATCGC GGTTTTCCTC ATTTAAGAGT GTCAGAGGAC
68101 CTGATAAAGT AATTGTAGGG TTATTGTCCT CTGTGAAAGG AGCACTAGAT
68151 TGTAGAGGAC GGATCCACAA GGTAGGAGCT TTTCCTTTTG CTAAGATCGA
68201 GGGGAGGTTA ATCGCAAGGT TATTAATGAT GACCTGGGAG CCTACACTAG
68251 TTGATGGAGT CTCAGAGTTG GCAGTTGTTG CAATACTCGC CGCATGCCCT
68301 AATTTAAGGA TACCTCCTTT TTGAGTGAAC TTATAGAATT GCCATCCCGC
68351 ACGATCCTCG ATAGAGAGGA CACCATTGCG AAGTTCAGAG GTATTTTTCG
68401 AGCTGCTAAT GAAATTATTT TCGTAGTCAG AAGCTTCTGG GATATAGGCT
68451 GAAGAAAATA AGATCGTTCC CTGATGGTTC GCATTGGGAT TAAAGATTAG
68501 AGGATTTGTA GTTGGATGTT GGTGTTCTAT AGGATCAAAA AAAGCAGTCG
68551 TATACCCCTT ATTAGCTCCA AGTTGTAAGT TGCTATTTGG TGTACAATGT
68601 ATGGCGTTGT ATCTACCAAA TGTGGTGAGG AAAACCTCAT TATTTTGAAA
68651 TGCGATATTT CCTTGTTCCG CGAAGAGTAG GCAGGTGCTG TCCTGTAGGA
68701 GCATAAGAGC ACCTCCCCAG TTTCCTTGAT TGTTGGTGAA ATATACGTGG
68751 CCACTATTTT TGATTGTCAA AAATTGTGTA CAGATAGCTC CGCCATCGCG



68801 AATGCAGTAG TTATTATTGA AAAGAATAGT TCCAGGGTTA TCGTCTATGG
68851 ATAGGTTTGT TGTATAAATC GCCCCTCCTG AACCATTTCC TGAATTTATC
68901 GAACCAGATA ACGCTGTGTT GTTATTGAAA ATCACCGACC CGGAGTTATT
68951 TTTTATCGCA ACAGTAACGC TTGTTTGAAT GGCCCCGCCA TTGTTCCCAC
69001 AGTTGTTCTT AAAATAAATA GGACGCGTGT TATCAGAGAT CGTTGTATTT
69051 TCACTACGAA GCGCACCCCC TCCACTAGGG GCTGTATTGT TAAAAAAGAG
69101 TAGAGGTGCC CTGTTGCTTT GGATGCGGCA GTGTCCATTG GTGGAGAGGG
69151 CTCCTCCCCA GTTGTTGACG GAATTGTTGA CAAAGTAGAA AGTCCCTTGA
69201 TTTTGAGAAA TCGTGAAGTC TCCATTACAG GCAATCGCAC CCCCACGAGT
69251 TTCTCCTCCT GTACTCGCAT TGTTAAGACC TCGATTGCTG AAAAAAATAA
69301 GGGGTCCTCT ATTCTTCGTG ATTGTGCAGG CTCCCTGGCA AGCAATCGCG
69351 CCTCCAGTCC CAATCGCGAG ATTTTTACTG AAGAAGGCAT GGTCTTCAAC
69401 ATTTGATAAT AAGAAATTAT TACAGGACAC AGCTCCCCCA GCCGATGTCC
69451 AAAGAAGAAG GATGTTATCA ATAGACTTGT AGTTAGAAAG TACAATGTCT
69501 TGAGAGGAAT TATGTCTATT TCCAACAAAC GTAGTTATTG GAGAAAATCC
69551 TGTAAGAGTG GAGAGAGAGT CTAAGAGAGG AAAGCTCGTA CGAAACTCTT
69601 CATCCCTCTC TAAAGCAAAC TTTTCAAGGG AGTCCGTTTG TAAACTATAC
69651 ACTGCAGGAG TCATCCCGAA CATGCAGGCT GTGAAATTCC CGAGATAGAA
69701 TAAAAACTTA GGAGGAGTCT TTGACACTAG AGGTTCCTTA ATCTTTCTGT
69751 TTTGGTCACT TATTGTTAAA ATCTCATTCT ACTCGCCACG TTTAAGTAGT
69801 GACTCAGCGT GGAGGAAGAA ATATCCGCAG AGTAATCTAA GGAGAGAGTG
69851 ACTTTAGGAA ACACCTGCAT GGTATTTTTC ACTTTGATCC CTAAAGCATT
69901 GTAGGTCACA GGAGTGGCCT GCGTCGTCCA CGTACCTTGG CTAATCAGTA
69951 ATTTCGAGTG GAGTTCAGGA TCTTGCCTAT AGAGAGTAGA GCGATAGGAA
70001 ATTTCTGTGA GCCAGACTAG GGGAACTCGG TGGTGGTTCT TCCAAGAAGC
70051 GCGGATTCCT ACAGGGAGGG AGACGTCCGT TAGGGGGCGG TGTAGGGAAA
70101 ATTCCCGAGC ATGGTCTCCA GATTCTTGAA ACGCAGCAAG ATTTCCTCGG
70151 ATGGCTAAGG CAGTAATAAA GGGATAGATC TGCAGGGACT CGCCGTGAGG
70201 TTGAGGTAAG AAAACACAGG AGAGAGCCCC TGCTAAGGTA TGGTTGTGGA
70251 AAGATCCCTG AGAGTTCCCT TCCAGGAGAC CCTGATACAT TGTATGGGTA
70301 TGTTCCGAGG TAAACATATA AGCAAGAGAC ACAGATAGAC GTATCCACTC
70351 TTTGAAGAGA GTATTTTCTA TGCACATTCC AGAGAAATAG TGGTGAGAGG
70401 ACGTGCTATT TTGAGATTCA TGTTCTTTAG CTTTGGAGAA GAACTGAGCA
70451 AATCCTAAAG AGAAATTCGG ACTTTGAGAA GAGGTTGCTT CGGTGGTAGC
70501 ACTATAACCT GTCATATGAC TACGAAATCC CTTAAAACCG TTTTTGTCTT
70551 TTTGATGAAC CAGAAGACCA ATGCCTTGTA GGGAAGCTGC ATGACCCTTC
70601 TCTTCATCCC AGGAGGAGAG GGAGTGGAGT CCTGCAAGAG CCGTATATGC
70651 CGATTGCCAC AAGGCATTCG TAATGAATTC TCCTCGACGT TCGGGATGAG
70701 GACGGTAGCC TAGAGGAGAC CAGTTTGCAT AGAGCAGCTT GTGTTTTGTA
70751 TTCGCGCCTA GTAGAGATGT AGGGTTCGTG ATTGTTGTAG TTTCTACCCA
70801 ATAGGTCGAC CAGATGCCTT GATACCCATA GTGTTCGCCA GAATTTAATG
70851 TGGATAGATC CAGTTGCGAA GAGTTAATTT TTTGTGCAGC GACATCGACA
70901 ATATAAAGAA GGGGAACTTT CTCAAGAGAG TGCGAGAGAT CCAGACTATC
70951 GTAGGGATCT TCGTTGTTGC TGTTGCGTAA GGTGAGAGTT CCTGAGATTG
71001 TGATTGTCGG GTTGGAATCT TCAGTATAGG TAGATCCTGT TTTTGTGGGG
71051 TAAATCCAAA TTTTTGGAGC CTGAGCTTGA AAAGAAAGAA TAGAAGGAAG
71101 GTCAATGGCA ATGTGATTTA AAGTTATAGT ACTTCCTACT GTCGTTGGTG
71151 TTGAGGATGG TGTGGGAATC GTTCCTGCTG TCGTGATCAC CGCACCTTGA
71201 CCTAGAAGTA GAGTGCCTCC TCGTTGGAAG AACTTATAGC AGGCCAGCCC
71251 CGCACCATCT TCAACAGCAA GGACTCCTTG ACGTAGTTCC GAAGTGTTCC
71301 TTAAATAGGA AAAGAAATTC ATTTCATCGG TAAAGTTCTG GTGTACATGT
71351 TCCCCTGAAA ATAAAACTGT ACCTGTATGA CCGGTTTCGA AATTAAAGAG
71401 TATGGGGAAG GAGGAAGGGA GCTCATGTTC TATGGGATCA TAGAACAGCA
71451 CTCGATAGCC GGGACGGGCT CCTATTTGCA GATTCATATT AGGAGTCGAG
71501 TGAATGGCGT TTCTGTATGG AGGATTGAGG GCATGCTTGG AGGCCGTATT

71551 ATTGTTAAAG ATAATATCTC CATTATCTGC AGATAAGATG AAGCTTCCGT 

71601 TTCCAGAACC TGCTGATAAG TTGAGGAGAG CCCCTCCCCG AGTTGCAGTG 

71651 TTATTTAAAA AGTATACAGG ACCATTTTCT TTGATAATGA TAGATTTCGC 

71701 ATGAATGGCT CCACCGTTGC TCTGGCTTTG GTTATTGTTA AAGAGTACCC

71751 CTTCTCGGTT GTTCAATATC GTGCAAAAGG TGGTAGTAAG ACCTCCTCCT 

71801 CCTGGATTGA AGTTCGATCC ATAGTTATTT GCGAACGCGG AATTTTCACT 

71851 GATTTCTATG AGTTTTTTAT TCGAGGAGAT CGTGAGTGTT TGGGTAGAAA 

71901 ATATGCCTCC CCCAGATCCT GAAGAGTTGC TTGTGATCTG TATAGCTCCA

71951 GAATTTCCCT CTATATTTAG AGAGTTCCCA CTATAAATCC CTCCTCCTAA 

72001 ACTGTCCGAA TTTAGTGCCC GATTCTGCTT GATTATGATC GGGCCTTTAT

72051 TGTCTTTAAT AGATAAGTTC GTCTCAGTAT AGAGGGCACC CCCCTTATTT

72101 AAAGCGAGAT TGTCAACGAA AGTTCCCTGT CCTAGGTTAT TAGTAATAGA 

72151 GCAATTTATG GCAAAGAGAG CTCCACCCAA TAGTGATCCC GCAGTGGCTG 

72201 TAGGATTGTC AGAGACCAAG TTTGTAGTAA ATGCATAGTT CTGATTCTTG 

72251 GAGATCGTGC AATTTTGAGC AGCATAAATT GCCCCGCCAG AATTGGGACA 

72301 GACATTCTTC TCAAAGAAGA CATTCCCTAT ATTTTCAGAG ATCAGAAGAT 

72351 TCTTACAGGT AAGAGCACCT CCATTCGACC GATAGTACTT ATAGTCCAAG 

72401 ATGAAATCAT TGTGATTCCC GACAATTGCG AGATCTTGAT TTTGGTTATG

72451 AGTAAACCCT ACTTGAGGGG CTGCTTGATA TTCAGGACTT AATGTAATAT

72501 AGGTCTCCAA AGGAAGTTGG AGACCTTCAT TAGCCAATAC AAAAGTAAAA 

72551 GGAAGCAACA TTCCGAAGCA AAAAAAGCGC ATACTCGTTC ACAAATAGAA 

72601 AGAAATATTC TGAATAAATC AATGCTAGAT TTTTTGTCTA TAATTTGTTT

72651 TATAGTGTCT AATTATGAGT AGTTTAAAGT TATCTTATTT AAGACAAGAC

72701 GATTTTAATT TCCTTTTTTC ACAATAAAAT GCGAGTCCAA GAGTTTTAGT

72751 AAACGTACAT CATACTTTGG AAAATGGCGG CAAGTAGTCA GAGGATCAAG 

72801 CTCTTTGCAT ACTTTGCTTA GAGCAGTTTT TGGTTCTCAG GGTCGTTTGC
72851 CTTTAGGGAG CTTGTGATGT TTGGGATCGG AGAGCCTCCT TATATCCCTA
72901 GGGGATGCCC TAGGAACTCT TCCGAAACAC CGAGGGTCTG TTAGAGATAA 
72951 AAACAAAGGA CCATCGGGGA GACTTGTACT CATAAGAGCC ACTTATCTCT 
73001 ATTCCTATAG ATTTTTGCTG ATTTTGATTA TTAGAAAATA ATAGATTTTT 

73051 TCTGTTTTTT AAAGTAAAGT ATTTTTTAAA AGACTCATTT TTAATGAGTT
73101 ATTACTTTTC TCTTTGGTAT CTGAAGGTGC AACAGCACTT TCAAGCAGCA
73151 TTTGATTTTA CTCGCTCCCT GTGTTCACGA ATTTCTAATT TTGCTTTGGG
73201 AGTGATTGCA TTGCTTCCTA TTATTGGGCA GTTGTATGTA GGGCTGGACT 
73251 GGCTCCTCTC TAGGATAAAA AAGCCAGAAT TTCCTTCCGA TGTGGATCAG 
73301 ATCGTGCGAG TAGAACACGT CGTGGGTCAC GACCATAGAA GTCGAGTTGA

73351 AGATATTCTA AAGAGACAAA GGCTCTCATT AGAGCCTAGA GACGAGGGGA
73401 AGGTTCACGG AGATGTGCCT TCAGCTCCTT TTTTTTGATA TCCAAAGTCT

73451 CAAGTTCCTA CAGTTGTTCT CTGAGGGGAC AGCTCTAAAT TTATTTCGTA
73501 TATTTGCTCC ACTACGCAAC CGTGTGACTA CAGAATACAG TCGTGCTAGG
73551 CAACCCGACC TACATAGAAT TGCCATCGTC TATATAGGAG TTCTCGATTC
73601 AGAAAGTTCC AAGATCCTAG AGCGGCTAAT CTCTTATATG AGTTGTATCT
73651 ATTCTGAATC GCAAATGTAT TTAAGATTCT TTATGGGCAA GAATGTAAAT
73701 CAAAGTGCTG TACTCTCAAA ATTACATGTA GAAAATCTGC ACATCCGTTG
73751 TGGGTTTTTC AGCGAGGATG CTGTTCCAGA GAGTGAGCCC TTCGATCTCT 
73801 CCATCTACGT GCACACAGAT CGTAGCTGTC CTCTCCCTAC GAAAAAACGG 
73851 AGCAGCTCCT GGGAACTCCA AACTGTAGAA CTCCCAGAGT CAATATATCC
73901 ACAGTCGGAA TTCCTATTGA TGAGACCTCG AATGCTTTCG TAGACTCTAT
73951 GATGAAACAA GGAGTCGGGC AGGATGCTAA AGAGCTATAC ACATTTCTAT 
74001 CTCGTGGGAA TGAGCATTAC CAACCGTGTC TATGGTTCAG TCTCGAAGAG 
74051 GAACTCGGAT TCCTTTTCGA TGAAAAAATG CTCTGCGCCC CTCTATCTGA 
74101 GGATCACTAT TGCCACTCGT ATCTTGTAGA TCTAGTGGAT CAACATTTAA 
74151 AGGATTTAAT ATTATCGATG TTTTTAGATC CTCAGAATAT CTCAGCAGGA
74201 GAACTCCTCA AGGTCTCTAT AAACGTTGGA GATTCTTTTT CTCCTCTACA

74251 ACAGAAAGAT TTCCTCTCGA TGGTCTTACG TGATGAAACG GGAAAAAACG 

74301 TCGTCGTGGT TTTTAAAGGA GTTCTCTCCT TACCCGCAAC CCAAGTCTGC

74351 AAATTAGTAG AGGAATTGAA CTCTAAGGAC TACTCCTACC TCAATATATT 

74401 TTCTTGTCAC GGAGATAGTA GTCCTCAGCT TTTATTCCGT AAGGAATTAG 

74451 AGGGAACTTC AGGGCGTTAT TTTACAGTGA TTTGCGCTTT ATATCTAGGG 

74501 GATACAGACA TGCGTAGTTT ACAACTTGCT TCTGAAAGGA TCATGGTCTC 

74551 TAGAGAGTTT GATCTTGTAG ATGCCTATGC TGCAAGATGC AAGCTCTTGA 

74601 AAATCGATCA TACAAATTGG AGACCTGGAA CTTTCAGTCG CCACGCCGAT

74651 TTCGCAGATG CTGTAGACGT ATCAGCAGGA TTTAACTCAA GAGAATTTAA 

74701 ACTGATTACG CAGGCGAATC AAGGGATCCT AGAGTCTGGA GAACTCCCGC 

74751 TCCCTTCAAA AACCTTCTGG GAAGGATTCT TAGCATTCTG TGATCGAGTG

74801 ACTGTCACGA GACACTTCAT TCCAATGTTA GACGCCGCTA TAAAGCAAGC 

74851 GGTATGGACT CATAAACATC CCAGCTTGAT AGATAAAGAG TGTGAAGCCC 

74901 TAGACTTGAA AACACAGTGC TTGCCATCTA TCGTATCGTA CCTTGAATAT 

74951 GTCACAAACT CTCACGAAAA AACATCGAAA GGCCCGTTCA TACAAAAAGA 

75001 GATTATCGCA GACTGTTCTC CTCTTAAAGA GGCGCTCTTC CCAGGTTCTG 

75051 ATGAAGATGT TCCCTCTACC TCTGAGGATC CTTCAGATGA TCATCCTTCG 

75101 GATCTTGAAG ACTCTTAATT AGTTGCGATA GAATTCAATT TTTTATATAA 

75151 AAACTATCGT GTTGTTCTTA TTAAAAGATA GTTAATTTTC TATCTTTTTT 

75201 TAAATCTTTA TATAGCCTGC GTACGCTTTC ATTTTCAATG TTGGTTTGAT

75251 CCTATGGCAT GCTATATTTC TATTTGGATA TCTACAGTTA AGCAGCATTT

75301 TATTAGGGCT TTTGATTTTA CACGTCCTCT TGGTTCTCGG ATTACAAATT

75351 TTGCTTTGGG GGTCATCAAG GCTATTCCCA TTTTAGGATG CGTTGTTATA

75401 GGGGTAAGTT GGCTAGTTTC CACATGTTCT GCACGAAGGT TTGGGAAACC 

75451 GGCATTTACT TCTGACGTTG CTAGTATCGT GAAAATAGAA AAAACTCGAG

75501 GTTATAATCC CCTTGCTTGG GTGGAACAGT ACTTGAGACA GCTTAGGGTT
75551 CGACTTCCTG AAGGAGATTT AGGAAAAATC CATGGGAAGG TCTCCAGAGA

75601 TTATGTTTGC GACAGGACTC CCCAAGAAAA TCTGAATATG GTTCCTCATC

75651 AATATCTGGG AGAGCTAGGT CGCGCGTTTT ATGGAATCCG CAACCGAGTA 

75701 ACCAAGGCGT ATCAACGAGT CACTCCTCTG GAAGTCCCTT GTCTTACGCT 

75751 CGTCGGTTTT GACATTTTAG ATCCCGAAGA TCAGGTGAAT TTCGTTCGTC
75801 TGGCTAACGG CATACAAACT CAGTACCCCC AAACTCAAAT AAAACTTTAT
75851 TTAATCTCTA TCCAAAAGAT ATGGAATCAG TGTGACGGTA CGATTTCTCA
75901 AGAAAAAGAA CAGCAACTCC GCTCTCTAGG TTTGGATGCT AAAATCAAAT
75951 GTGTGTCGGC CCCCGCTCTC CTGCTCCAGA AATATCTTCA ATCCGAGAAC
76001 TTGCCTTCCT GTGATCTTCT CATTAATTAT TACGGGAAAC AACAGTCCGT
76051 CAGAGACGTG GACTCTATAA AGAGTCTACT CAATCTTTCT TCCGAACATA
76101 TCCCTGCGAT TTCTGTAACC TATAGACCTG ACGATCCTTT TTATAGCTAC
76151 TATTTCTTTC CTGGTTCTCA AGGAGGAACG GCACCCGATC AGAGGATCCC
76201 TTGGAGTGAG CAGGAGCATC TTCAAACGTA TACCACCCTG TCTAACCCTA
76251 GATGTGATAG ATATGCTGTT CACTTGGGAA TGGAAGATTT TGCCTCTGGA
76301 GTATTTTTAG ATCCTCTTAG GGTTTCGGCT CCTTTATCTG GAGAGTATTC
76351 CTGCCCCTCA TACCTCTTAG ATTTAAAAAG TGAAGAGCTT CGTTGTTTCT
76401 TGTTATCCGC TTTTATAGAT CCCAACAATT CTGGTCAGGG AAATCCGCGT
76451 CCTATGTCCA TAAACTTTGG AAACTCTCCT TTGGGTCAGA GGTGGTCTGA
76501 GTTTCTATCT CGTGTTCTAC ATGATGAAAC AGAAAAGCAT GTGGCTGTAG
76551 TCTGCAATAA TCCACAACTT ATAAAAAAGA GTTTTCCCTC ACATTCTTTA
76601 TCTCTATTAG AGAACGAACT GGAAGAGTCA GGTTATTCTT ATTTGAATAT
76651 CGTTTCAGTG AGTCAGGAAC GCACGTGTGT TAAGGAACGT AGAATTTTAA
76701 GTTCTGATCC TTCGGGGAGG TCATTCACTG TAATCCTCAC TGATCTTCCT
76751 GAAGGGAGTT CGGATATCCG CAACTTGCAG CTAGCGTCAG ATAGGATCTT
76801 AGTTTCTAGT GCTCTCGATG CTGCTGATGC CTGTGCTTCT GAATGTAAGA
76851 TCTTAGAATA TGAGGATCCC GAGCAAGAGT GGGCGCAACA GTATGCGTCG
76901 TTCTATAGAA ACATCGACAG GGCAGGCGAT CTTCAACGTC AGGGGATTCC
76951 AGGAGAGCCT TTAGGGGTCT CAGCATCTAC GAGAGTAGTT TTAGAAAAGG
77001 ACATCGTATT CAATCTCAAT GCGGTAATCC AACAGGCCAT GTGGAAGTTT
77051 AAAAAACGGG ATCTTTTTGC TGTAGAAAGT CAGGCTTTAG GAGATGACAT
77101 GCGACGTGCT TTAGAAGGTT ATATCGGCAG CAGTCTCTTA GTTGAGGGGA
77151 CTATACAGCC TCAAGTCGCA TGTAATGTCA ATGTGAGTTT TGCTACGTTA
77201 GACGAGGCTG TGTGTGCAGC TTGTGACTCA GCTCAAGATG CACCTTCTGA
77251 GGAGAACAAT ACAGATGACT AAAGATCGCA ATCTTGTGAA CGAAATCGCA
77301 GATTGATGGG AACTAATTAG ACACACCTTT CTAAGGTGTT TGTTTTGATG
77351 AACCTTTTTA TTAGTCCAGC AGAGCTCTTT TTTGAAGATT CTTCTTTTTT
77401 CTTAGGTCAT TCTGGGTTTT TTGAAGGTAT CGAGGGTTCT TATTGTCTAG
77451 TTGTCTATAG AGGGTATCGA GGTTTTTTCT CTTAGGTATC CCACGATTCT
77501 TTTGTATAGA AAAATTTTAT GAAAGCTTGA ACTCTTTACA CTGACTTTTT
77551 ATTTTTCAAA TAAAAACGTT TTTAAAAATA TTATTATCAT AATTAGATAC
77601 TTATTTGTTT TAATGTCTTA TTTGATTAAA ATAACTTTGT TAAAATTTTT
77651 ATACATAAAT TTCTATTGTG GCTTGTCCAA GTATTTCTTC TTGGTTTACT
77701 GTCGTTCGAC AGCATTTTGT AAACGCCTTT GATTTCACCC ATCCCGTTTG
77751 TTCTCGGATT ACAAATTTTG CTTTGGGGAT CATTAAGGCA ATTCCCGTAT
77801 TAGGACACAT TGTCATGGGA ATCGAGTGGT TGATTTCCTG GATTCCCAGA
77851 CACACCGTTC GTCATGGAAT GTTTACTTCT GATGTCTCTA GTGCTATTAA
77901 AGTAGAACAA ACACGGGGTC ATAATTGTTT AGCTCCCCTA GAAGCCTATT
77951 TAAGTAGCTT GAGAGTCCCC ATTTCCCAAG AAGATCTAGG CAAAGTACAC
78001 GGGAGAACCC CAGAAGATCC CTTCGTAGAT ATCACACCCA CAGAAATTGT
78051 CCAACTTCTC CCTGATGAAG AACTCTCTAC TGTAGATGAG GCACTGCAAG
78101 GCGTTCGTAG TAGGTTAACC TATGCCTATA GGTCCGTAGA GAAACCTATG
78151 ATTCAAGATC TTGCTCTTGT GGGTTTTGGT CTCCGAGATT CTGCGGACCT
78201 CATAAATTTC GTGCGTCTTG CTAATGGCGT GCAGAATCAC TATCCCCATA
78251 CTAAAGTGAA GCTCTATTTA GCGAAGAACT TGGCAGATGT CTGGGACTGT

78301 GAAATTTCTG AAGAGGAAAA AGGGCAACTC CGAGCTCTAG GTTTAGACCC 

78351 TAAAATAGAG AGTATATCCC TTACGAGTGC AGGTCTTCCT TCAGTGCCAG

78401 AAGTCGCTAC TGTCGATTTT ATGATTACCT GTTACGGGAA AGATCAGGAA 

78451 GTCCAAGATC CCTAGGTGAT ACAACATCTT CTAAACTTTG CTCTAGAAGA

78501 GACCCCTTCC ATTTCCGTGC AATACCAAGA ACAAGAGAAG CTCTCTCCGT 

78551 GCGATCATTC CCCAGAAATA GGTAAAAAGA AAAGATGGAA TAAGCTGGAA 

78601 TCCTTCTCCA CGTATTGTTC TCTGTTTATG TCTGTTAAGG ATCATTATAA

78651 GCTGAATCTA GGAATTCAGA ATTCCCTGTC AGGGTGGCTT CTGGATCCCT

78701 ATAGGGTTTG CGCGCCTTTA TCTTCACCGT ACTCGTGTCC TTCCTATCTT

78751 TTAGATTTGC AAAACAAAGA GCTACGTCGT TCCCTTCTGT CAACGTTTCT 

78801 AGACCCTAAA AATCTCACTA GCGAAACATT CCGTTCTGTC TCTATAAACT

78851 TTGGCAACTC TTCGTTTGGA CAGAGATGGT CAGAGTTTCT ATCTCGTGTT 

78901 CTGCACGACG AGAAAGAAAA GCACGTAGCT GTTGTTTGTA ATGATGCAAA

78951 ACTTCTGGAA GAAGGATTGT CCCCAGAGGC ATTGTCTCTA TTAGAAGAAG

79001 ACTTAAGAGA ATCAGGGTAT TCGTATCTAA ACATTCTCTC GGTGAGCCCC

79051 GAAGGAGTCT CCAAGGTTCA GGAACGTCAG ATTCTAAGGC GAGATCTCCA

79101 AGGACGGTCC TTTACTGTCA TGATTACAGA TCTTCCTTTA GGTAGCGAAG 

79151 ATATCCGTAG TTTACAATTA GCCTCGGATA GGATTTTAGT CTCCAGTTCT 

79201 CTTGATGCCG CGGATGCATG TGCTTCGGGA TGTAAAGTCT TAGTCTACGA 

79251 AAATCCAAAT GCATCCTGGG CTCAGGAATT GGAGAACTTC TACAAACAAG 

79301 TTGAGAGAAG AAGGTAGTGT TTCTTTCAGA GAATATTTCA GAGCCTATAT 

79351 GTGTGATAAA ATCGTGGCAC AGAAGAACTT CTTATTTACT TTAGACGCTG 

79401 TAATTAAACA GGCQSGTTGG AGATCACAAG AGAAACTCAA TTTATTTTAT 

79451 GTTGAAAGTC AGGCTTTAGG AAGAGAAATC AAAGTCAGCT TAGAGGAATA 

79501 TATTCAGAGT ATGGTCGGGA TTTTGGGATC TCAGAGAACC AAGAAAAGCT

79551 TTAAGTTTTC TGTCGACTTT ACCCCTTTAG AGCAGGCTCT ACAAGAAAGA
79601 TGCTCTTCTG ATGATGACGA AGATGCAACA GCAACTTCGA CCGCTACAGG

79651 GGCAACAGCA TCTCCGACTG ACATGCACGA AGATGAGTAA CGTTTGTCTG 

79701 ATACCTTAAA AGTTCCTTGC AAAGGGCTCC CTGAAAACTA AATTCCCTCA 

79751 GAATCTCGAA TTCTCCTGAC TCTGAAACAA TCTTAGGTTT TCCTGAATAG 

79801 AATCTGACTG AAATTTCTGC TCGAATCTAA GGGCTGTTTC TTATTTTACC

79851 CCTAGATGAG GATATTAAAT CCAAGCTAGG ACTTCAAAAG TAGTTGGTTA 

79901 TTAGTTTATT AAAGAAAATA ATACTAAAAA TATTTAAAAG CTGTTTATTC 

79951 AATTTAATTG ATATTTTCTA TGTTGTTATT TAAAATTGTT TGTTTCTAAT

80001 TTTATTTTTT TTGTTGTTAT GCCAATTCCC TATATTTCTT CTTGGATTTC

80051 TACCGTTCGA CAGCATTTTG TTAAGGCGTT TGATTTCTCT CGTCCCTTTT

80101 GTTCTAGGGT TACGAATTTT GCTTTAGGGG TCATCAAGGC CATCCCTATT 

80151 GTAGGACATA TTGTCATGGG GATGGAGTGG TTAGTTTCTT CCTGTGTTGC 

80201 CGGGATTATT ACTAGGTCCT CCTTTACCTC AGATGTCGTT CAGATTGTAA 

80251 AGACTGAGAA GGCGTTAGGT CGAGATCATA TATCTCGAGT GGCGGAGATA 

80301 TTGCAAAGAG AAAGGGGGAC CATAACTCCT GAGAATCAAG ATAAGGTGCA 

80351 TGGGAAGTTT CCTGTCTGTC CTTTTGGTCG TTTAAAATCC GAGGAAACTT 

80401 TAAAACTTAA GCCGGGAGAA AGAGAGGGAA CTTTAGATAC TGTATTTTCT 

80451 CCGATTCGCA CGCGCGTGAC TCGTGCGTAC TTACAGGCCC CCCGACCCGA

80501 AATACGTACG ATTTCTATTG TGGGTTCGAA ACTTAAAACT CCTCAAGATT 

80551 TCTCGCAATT TGTGAGTCTC GCGAATGAAA CGCAGAGACT GCATCCTGAA

80601 GCGTTAGTTT GTCTGTATTT GACAGGCTTG AATCGCGAAT CTCAGATGTG 

80651 CGATACAACT ACTGCAGAGA AGAAGCAGTA CCTACATAAC TCAGGTCTCG

80701 ACTCTAGAAT CCAGTGCAAA GACAGTAAAG AAGACGACGC TGGCTCTCCT 

80751 GAAAATCCCG AACTTTGGAT TGGCTATTAT TCACGAGAGC AACAGCATAA 

80801 TATAGACGGG CAGTATATTC AGCAGTGTCT AGGGAAGAGT GCAGATCCAA

80851 TTCCTTGGAT TCATGTTACT GAAGACACAA AGGATTTTTA TTACCCACCA 

80901 AACTTTACTT CATACTCACA TACAAGACAA TCTACAGACC CAACATCGCC
80951 ACCAAGACTC CCTGAAAGTG AGGGGGATAA GGATTCCTTG TACGGACAAC
81001 TGAGTCGATC GTATCACCAT GAGTATATGC TTGGTTTGGG ATTAAAACCA
81051 GAGGATGCAG GACTCCTGAT GGACCCGGAT AGAATCTATG CTCCTCTATC
81101 CCAAGGGCAT TATTGTCATT CCTACCTTGC GGATATAGAA AATGAGGATC
81151 TACGAACTTT AGTCCTTTCG CCTTTCCTAG ATCCTGGCAA TCTTAGTAGC
81201 GAGGATCTTC GTCCTGTAGC ATTCAATATC GCTAGATTGC CATTAGAATT
81251 GGACTCGTTA TTTTTCCGCC TTGTTGCGGG TCAGCAAGAA GGGAGAAACA
81301 TAGTTACCCT TGCCCACGGA ACTCCTCGTC CAGAAGATCT TGATCCTGAC
81351 TCAATGAACA TTCTGACCAG AAGATTACAA ATGTCTGGAT ATAGCTATTT
81401 GAACATTTTC TCCTATAAAT CACGGAAAAT GATTGTAAAA GAACGTCAGT
81451 TCTTTGGAGA TCGTTCTGAA GGGAAGTCTT TCACATTGAT CTTATTTGAG
81501 GATCCCATTA GTGCAGCAGA TTTCCGTTGT TTGCAGCTAG CTGCAGAAGG
81551 TATGGTTGCT AAGGATCTCC CCAGCGTAGC AGATATTTGT GCCTCTGGAT
81601 GTTCCTGCAT TCAGTTTTCT GAGATGCAGA GTCCTCAGGC TATTGAATAT
81651 AGACAATGGG AGGCACGTGT CGAAGATGAA GCAGGAGAAG AAGCCAGAGA
81701 ACCAGTAATT TATTCTCAGG ATCAATTGAG CAGCATGCTC ACTACACAAC
81751 AGAATTTTGT ATTTTCTCTA GATGCTGTGG TAAAACAGGC GATCTGGAGA
81801 TTCCGTTCGA AAGGTCTTCT TACTATGGAA AGAAAGGCAC TAGGCGAGGA
81851 GTTCTTAACT GCGATATTTT CCTATTTAGG GAGTCAGGAG CGTAATGAGA
81901 ATATGGGGAA AAGAACTACC GAAGAACATG AGGTCGTTAT CAGCTTCGAA
81951 GAGCTAGATC GCATGGTGCA AGTCCTCCCA GCCGAAGTCC CTGCAGATTC
82001 AGGCAATGAT CCTACGCGTC CCGTTCCTAA TCCAGATAGT AACCCTGATT
82051 CCTCGCAAAA TGAAGGCAGT TAGAAAGTAA AAATACTAGA GAAATTTCTT
82101 ATCTCCAGAT GGAATCCGTG GTCCATGTAC CTAGGATTCC AGGAAGGGTT
82151 TGCCTAGGAA TATCTTAATT TCACAAACCC CAGGAGTGCA CAAACTCCTT
82201 AGAGGACTCT CTCTATATTT GTTCTTTCTA CACTAGTATT CCGTAGATTT
82251 CTGATTTCCA GGATAGGATT CTAAATAGTT GTATCTAAGC GCTTTTTACA
82301 ACCTTTCTCA GGTCTCCTCT TTCTAATTTT AAAAATAGCG AATTTCTTGT
82351 GGCTATAGGA GTCGCTAAGT ATCTTAGAGA TTGCTTTTTT CAAGTTTTTT
82401 TTAAATACTT CTTTCCTAAG TATTTCTTCC TAGTCGTGTT ATGGCTTCTT
82451 GTTTATCTGC CTGGTTTTCT ATAGTTCGTG AGCACTTTTA TCGAGCCTTT
82501 GATTTTTCTT TGCCGTTTTG TGCTCGTATT ACGGAATTTG TATTAGGGGT 
82551 CATCAAGGGG ATCCCTGTTG TGGGTCACAT TATTGTTGGG ATAGAGTGGC 
82601 TCGTTTCTAG GTATTTAGAG AGTTTCGTGA CCAAGCCGAC ATTTGTCTCT 

82651 GATGTGGTGA GTCTTCTGAA AACAGAGAAA GTTGCTGGTC GCGATCACAT
82701 TGCTCGTGTA GTGGAGACTT TGAAGAGGCA GAGAGTCGCT GTGGCTCCTG
82751 AAGATGAGGA TAAGGTCCAT GGGAAGATTC CTGTGCATCC TTTCGGGGGA 
82801 ATCCAACCTG TAGAAGTTCT CACTCTCTAT CCCGAAGTTC AAGATGCAAC 
82851 GTTAGGGCTT GCCTTCTCTA AAATTCGTAA TCGTGTAAGA CAGGCGTATT 
82901 TGCAAGCTCC ACGGCCAAAA CTGCAGAAGA TTTACATCAT AGGAAACGAT 
82951 ATGAATCCTT TTGAAGTTGA CGACTTCTTG CATCTAGCCC GTCTCTGTAA 

83001 TGAAACTCAA AGACTCTATC CTGACGCTAC GATTTCTCTA TATCTAACAG
83051 CTTCTGGTGG TCGCAATGCT ATGGACAAAA AGAATCGGAA GTTACTTAGT
83101 GATTGCGAAC TAAACCCCAA GATTGCTTGT TTGGACTTTA ATCAGGGTGA 
83151 TGTAGTCAAA CAAGCAACTT GTGACTGTTG GATGGTGTAT CATGGGGAGA 
83201 ATGATCAAGG TACGTTGAAT CAGATTCAGG AAGAGTTAGA AAAGTCAGGG 

83251 GAGGAAACCC CTTGGATTCA TGTGGGGCAA AAGCCTCTTT CACAATCCTT

83301 GTGGGATTTC TCTCCATTTT CATCTTTGGA GATGAAGGGA GATAAAGAGA
83351 AAGCTCTAGA GTACTCTGAA TTAGAAAAAG AACAGCTATA TTCTCGATTG
83401 GTATACGTAG GAGAGCGCTC TTCGGTTCTT AGTTTGGGGT TTGGAGATAG
83451 TCGGTCAGGG ATCTTGATGG ACCCAAAACG GGTGCATGCT CCCTTATCTG
83501 AAGGGCATTA TTGTCATTCC TACCTTGCAG ACTTAGAAAA TCCCGGGTTA
83551 CAAAAAACAA TTTTAGCGGC ATTTCTGAAT CCTAAGGAGT TGAGCAGTAC
83601 CATACTGCAA CCTATATCTC TAAATCTTAT CTTAAATAGC AAAACTTACT
83651 TAAGGCAGCA CTTTGGCTTT TTTGAGAGGA TGAGCAGAAG TGATCGCAAT

83701 GTGGTTGTCG TTGTATGTGA TTCTTGGTGG GGTACCGACT GGAAGGAGGA 

83751 GCCAAGCTTC CAACACTTTA TTATGGAGCT AGAGTGTCGA GGGTATTCGC 

83801 ACTTCAATAT TTTTGCCTTT AGATCTAATA GCATGTGTGT AGAAGAACGT 

83851 AGGATCTTAA ATGAAAGTTC TCAAGAGAAA GCCTTTACCA TGATTTTCTG
83901 TGAGGATTCA GTATCTCAAG GAGATATCCG CTGTTTGCAT TTGGCGTCTG 
83951 AAGGAATGCT TTGTGGTAAA GAGTGCTATG CTGTCGATGT CTATACGTCA 
84001 GGATGCGCGA ACTTTATGAT GGAAGAAGTC TTAACTTTGG AGCGAGAATC
84051 TAATCTGTGG AATAGAAAGC ATGGTCTTTG GAAAAGAGAA GTTAGAAAAC 
84101 AGAAACAAGA AGCTGCTTTG GATCAAGACG AGAGCGAGAT TTACGTTTGT 
84151 AATCAGCTGA CGGCGCAACA GAACTTCGCT TGTTCTTGAG ATGCTGCAAT 
84201 CCGCCAGTCT ATATGGAGAT CCCGTATGCC AGAACTTCTC TCTATTGAGA 
84251 GACGGGCGTT AGGGGAACAA CTCTTTACTA CTGTACATCA CTACCTAACA 
84301 ACGCAAAAAA AGATCCTCAG GGGAATCTAG AAACGCAGCA ATCCGCGCAA 
84351 TTGTCTATAG ATTTCACAGC ATTAGATGAA GCTGTTGAAT CTCTAGGATC 
84401 GACTCTTAGC AGAGCTCCTT CAGAAATATC TCCAATTCCA GAGGAGGAAG
84451 CTCACTTAGG AGCCAACAAA TAGAGACAAA GAAAATTCGA CGGTTTGAGG 
84501 ATAACGATAC GCAATGTCTA AGCTTTGAAT CAGGATACTC TGCTTTACAG 
84551 GCGTGTCTTT GTGCCTATGG TCTCTCTCTC ATAACAGAGT CTCTCAAATC 
84601 TAATTGTCAG AACCTATTTC CCCTAAGAAT CGATAACGTA TTTGTTAGGA 

84651 AAACGTGCTC AACAAGAGCG TTTTATTTTC AGTGTACTTG ATGTCTAATA
84701 CAAATTGTTT ATTTTGTATT TTTGGCACAT CATTTAAATT CCTTGTACTT
84751 TTGAATCTAA ACGTAAATTT CTTATGACTC ATTGCTTACA TGGTTGGTTT
84801 TCTGTAGTTC GTCATCACTT TGTGCAGGCG TTTAATTTCT CACGTCCTTT
84851 ATATTCTCGA ATTACCCACT TCGCTTTAGG GGTGATTAAG GCCATCCCCA
84901 TTGTAGGGCA TCTTGTTATG GGAGTCGATT GGTTGATCTC TCATTGCTTC 
84951 GAGAGGGGAG TCTCACACCC TGGGTTCCCT TCAGATATTG CTCCTATACT
85001 GAAAGTAGAA AAGATCGCGG GCCGAGATCA TATTTCTAGA ATCGAAAATC
85051 AGCTAAAGAG CCTTAGGAAA ACTATCGAGG TTGAAGATCT AGATAAAGTC
85101 CACGGGCAAT ATCAAGAGAA TCCTTATGCA GATATGGCCT CTAGTGAGGT
85151 TCTTAAACTC GATAAGGGAG TTCATGTTAG CGAGCTTGGC AAAGCCTTTT
85201 CTAGAGTTCG CAATCGCATC ACCAGATCCT ATAGTTATGC CCCTACTCCT
85251 CAGTTGGACT CTATAGCTAT TGTTGGTATA GATCTCGTCA GTCCTGAAGA
85301 ACAAGAGAAT TTAGTACGCT TGGCGAATGA GGTCATTCAA CTCTATCCCA
85351 AATCAAAGAC AACTCTATAT CTTCTTATCG ATTTTAATAA GGAGTGGGTA
85401 GGGGATATCT CCTCTGATAA GGAAAAACAG CTCCGTTCTC TAGGTCTACA
85451 TTCTGAAGTT CAGTGTCTTT CCGTCTTGGA ACCTCAGGGT GCCGAGGGCG
85501 AAGATACGAA ACACTTTGAC CTTATGGTCG GCTGTTATGG GAAGGATTCT
85551 TACTTAAGGG AGGGTAAAAT TTTACAGCAG GCCCTAGGGA CTTCGTTAGG
85601 TACTGTTCCC TGGGTGAATG TTATGCACAC ATTGCCATCT AGGTATAGAT
85651 CTCGGCTTTC CTTACCTATA AATACCGAAA AGGATAAGAC AGAGCTTTAT
85701 AAAGAGATTT CTCGTACACA CCATCAGTTG CATACTTTGG GAATGGGACT
85751 TGGAGCCCAG GATTCAGGAT TGCTCTTAGA CCGGCAACGA CTCCATGCTC
85801 CTTTATCTCA AGGGTCTCAC TGCCATTCCT ATCTTGCAGA TCTCACCCAT
85851 GAAGAGCTGA AAATTTTGTT ATTTTCAGCA TTTGTGGATG CTAAGAACAT
85901 AAGTAAGAAA GAGCTTCGTG AGGTATCTCT AAATTTTGCT AACGATACTT
85951 CCGTAGAGTG TGGCTGCGCT TTTTACTTTT AGTGTCCTAT GATGAGAAGG
86001 AGAAAGACGT AGTTGTCGTT TGTAATCATT CTGAACCTAA TATCCTCGGC
86051 CTGCCTCCTG AAGCAGTCTC TCAGCTTATT GAAGAGCTTA GCGATGAAGG
86101 CTATAGCTAT CTGAATGTAG TGCGTTGTGA TCTCTCCGGG GAGACTACGG
86151 TTCAACAACG TCTGCTATTG AATGCCGATG AAGGGAGATC TATGACGGTG
86201 GTGATCTCAG AGCTTCCTGA AGGGCACCCC GATATTCGGA ATTTGCAGTT
86251 GGCATCCGAA AGAATTTTTG TTTCTCGTGA AAAAGAAGCT GCTGATGCCT
86301 ATGCTTCAGG ATGTAAAGTG GTCGCTTTCG ATGATGAGCA TCTCCCTTGG

86351 GTCTCCAGTC ATATTGCCTA CGCGGAGGAG ATCAGAGAGA AACAAGAACA 

86401 AACAATGCAA GGGTCTTTAA CTGAAGAGCA GTTAGGAGCA CTCCTCTGCA 

86451 ACACAGTCTC CACAGAGAAA AATCTAGCCT TTGCTCTAGA CGCCGTGATA 

86501 AAACAGTCTG TGTGGAGATT CCGCAATCCG GATCTTTTTG CTTATGAGAG 

86551 AGAAGCTCTA GAGGCTTCAG TAACAGATGC TTTAGTATCT TACGTTTCAA 

86601 ATTTAGACAT GATACCGTAC ACAAGTTCTC AGGGCATAGT CATAGAAGAT 

86651 AGTAGTATCG TCCGTACCTC TCAAGAGCAT ACACTCATTG TGAACTGTGC 

86701 AGCATTCGAT AAGTTAGCGA GCCAAATAGA GTTCTTATGC CCCAGTGACG 

86751 TGTTGCCCAT TTCTGGTAAA GACCCTTTGA TTTCTGATGA TGAGGATGAG 

86801 GAACTGAATC CTAAAGTTTC ATCTGCTGCA GACTCTAAAG ATAAAACCTA 

86851 GGGAGTGAAT TCTACACGAG AATCGAGAGG AGAGCGAGTC TTTCAAGGAT

86901 TCATAATCCT TGTTAACGTA TGCATAAACA AGTGAAGCCA TTGCAACGTG 

86951 AAGTAATCGC ATTGATGAAA GATGCTTTCC CTAGGGATGC AAAGCAGATC 

87001 GTATTCCCTT CTTTCCAAAA CAGACTATAG ATTCAAAAGA TATTCTTTCT

87051 TTTCAAATAG ACTTGAGAGA GGGGGGGGGG TTCTCATAGA GTGAGAATCT

87101 TGGCCTTCAT TGCTAAGTTC TTCGATGATG GATAGAGGAT TCTAAGACGA 

87151 CCGGGGCTAC AGAAAACTCT AAGCAGAGCT TAGAGTTTTA AAATGTGGAT 

87201 TTTAGTCCTG TAGACACTCG GTGGTTTGTA AATCCATTTT TCCCGTCAAA
87251 GGTATAGTTT AGAAAGGCCT GAGTGTCCTC GGTGAGATCT ACTACATCAT 

87301 GGATTTGTAC AAACAAACCG TGGCGGCGTA GATTTGCTCC TGAGATCGAA
87351 GTGCTCTCTT GGTTTGAGAC GACAGTCACA ATATTGTGAG GGTTGACACG 
87401 ATAGATATCA GGCTTGTAGG CAAGTTTGAT GGTCAGTGTA GAAGGAGCCT 
87451 TCTTAAATGG TGTGAACCAT TGAGAAGAAC ATCCTATCGG TAGGGAAACA 
87501 TTGTACCCTT TACCTCTACT AAAGCTACGT TGCAGATCTC CAGTTTCTGT
87551 GAACTTACTT TGCCAACCAC CTAGGAATTC TGCGGAAATA AAACCTGAAA
87601 GATCCCAAGC TTGAGCCAGA GGTCTTGTAA GAAGACACCA GTTTAGGAAA
87651 GGATGTTCTG CAGAAATAAG AACATAGTAA CTATTGTTAT GCCATTGCCC
87701 TTGAGATTTT GGAGCTTTGT CAGGTCTGAG GTAGGTGGTA TTTAGGTGAT

87751 TTTTGGAATA ACCATAAGCT GCTTTCCAGG AAATTAAGGC CTCGPTPTTT

87801 TGAGTCACGA TAGGGAATTG ACCAAAGAAC GAGAGTAAAT ACATTTGTTC

87851 TGAGCAACGT GAATCGTAGG GGTTGGCGTT AGTTTTTCCA TAAAGCTGCC

87901 CGAAAGAAAG TCCTAACGTA GTGTGGTCCG TGTAGTTCAT AGATAGCGCA

87951 GCTTGGTAGC CTCCATAGCG ACCTGAAAAG CCCTCATGTC CTTGTCTTGG

88001 TGTGTGTTCG ACATAGGCTC CTAAAGCTTT CGCGGTTATG GACAACCCGG

88051 GATGATCTAT CAAAAGAACA TCTTGGAGAA TATCAGAGAA TGCCTGATTT

88101 CCTAAGAAGG AAATCCATAA GCTGTTGCTG ACAATTTCTC CGTAAPGPTP

88151 GGGATCTAAG ATATAGGTAG AACGCACGAG AGTGTCTGAA TTPPATAPAP

88201 CATAGAGAGT ATTTGCGCTA GGAGAGGGAC CTCCAGGAAA TPPTTVATPA

88251 GGAGCTGGAA TTAACAGGGG ACGGGACCAT GTGTAGGACC APTTTPPTTP

88301 GTAGCCGTAG TGGCTTGGAG TCGCAATCTC CCCATCAGGA AATPPTPTPT

88351 TAGTAACGGT TGCTCCTTTG AAAACAGCGA TAGGAATTGC TAPTPPAPTT

88401 TGTAATGACA CCATATCATA AAGATCTGTA ACGTCATGTT CATCAAGAAP

88451 CAGAGCTCCT GTTAAAGTGA CGTTTTTTGT GCCTGCATTT ACTGATPPTP

88501 AAACAAAATC TCTTTTTAGG AAGGAAAAAG GATCGAATGC TAAPTTTPPA

88551 ATCGTAAAGT CTACAGCGGC AGGTGCTCCC

88601 GGTTCCTCCA GAGCCCAGGG TAAGCTGACC TGAGCCCTGA

88651 CAAGAACATT GACAACCGCA TTGTCAGTAA TCTTCAGTTC TPPAPTAPrr;

88701 ATCTTGACTG TTCCTAGAAG TATAGTTGTC GTGTTGGCAG GPAAPAPPAP

88751 TTCTGTAGAG GAGAGTCCCT TACTTGTAAA GACTACAGAT PPTPAAPPQP

88801 CATTAGCGTT GATTGTAATG TCTTTATTAG APGGAPTTGT

88851 CTATGTGTAA TGGGATCATA AAATACAAGA CGTGAGCCTC CTTGTGCAGA

88901 TAGAGACACA ATCTCTCCCC CTGCTTCTAC AGTGATGGCA TTGCGGATTC

88951 CAGGTTTTGT ATTGAGCATG TTTCCTTGGA ACGCAATATC ACCTTCGGAA

89001 GCCAAAATCG TAAGTGTCGA TGTTTGCTTC GCAGGGTCTC CAACAGAGGG



89051 GCCTATGAAA ATAGCTCCAC CCTTCTCCGC GCGGTTTCTA GAGAATTCAA

89101 TAGGACCGCG TCCTTGAATA TTGAGCACTT TAGCACAGAT GGCTCCGCCA

89151 TTTCTCGCTG CAGTGTTGCT ATCTAGGAGC AAGGCACCCT GGTTCCCTAC

89201 GATGTTGCAG GTTTCGGCGT AGATCGCACC CGCATCATTT GGTGTACCAT

89251 TATAAGAGAA GGTGCACTTC CCCTGATTGT TTTTTAATTC GAAAGTTCCC

89301 GTAGGGATGC AGATGGCTCC GCCACTTCCT AAAGCATAGG TTCCTGATGA

89351 AGGGGTGACG CCTTTTAAGC TGTTCACACA GGAGTTGGCG GTGAAGATGA

89401 TGCATCCAGA ATTGTTTTCA AAAAGGAGAG AGCTTCCTCC ATAAATCACG

89451 CCGCCTCCTG ACGAAGTAAA GTTATGGGAG AAGCTCATGG TAGCGGTGTT

89501 ATTGATGAAT TTAACGACTG CCGTGGGAGA ACTGCTAATT GCAGAGCCGA

89551 AGTTGGCAAG GTTCCCAAAA AACTTGATCG ATTGGGATAT GTTTTCGACA

89601 GTGAGTGAGT CTGTAGGTTG GAGAGCAGAG GCTCCTGTAG TTACTGTAGT

89651 GATTGCGGGA GTTGCCGACG TAGTTACAGA TGCTGGAGAA TAGGCAACAC

89701 GGTTCGTAGT GAAGATCAAA TCTTTGATAG ATTGGAAAAC AATATCTTTT

89751 CCGTAGATGA GGCCGCCAAG TCCTGTCGAT TGGTTCCCTG TAAAGTTTAT

89801 AGAAGAAAAA TTCTTGAAGG TCAGAGTCTC TGCGGCAGAA AGTAGCGCAT

89851 AGTTACTATT TGTAGGAGCT TGACAAGAGG TAAACGTTAA GGAAAGGCCC

89901 TTCCCTAATA AGGAGAGATT GCTGGAACTA TTGATAAAGG AACTTGAAAA

89951 ACTCGCCCCC AAGAAGTTTG AACAAACAAA ATCTGAAGTG AGTAAGATCT

90001 CTTCACCCTG ATTCGTAATT GGAGGAACAA AGTTCCCTCC GAGTCTAGTC

90051 TCAGCAAACG CGCAACTTGC ACTACATAAA CAGGCAAGTA GACAAAAAGA

90101 TGAAGATTTG AAAGAAAGAG GCATGCCTCA ACCCTGTCGT TAAATAAGGT

90151 TTAAAATCGT AATTTGCTTC CTATATCTAG AGTATATTGA CGGGAGGTTC

90201 TGCGAATATC ACATCCACAG TGCCCGAAGA TCTCTAGAAC ATCATTTACT

90251 GAAGTATGGC TGGATGCTTG TACTAGCAAA GTACTTCTGG TTAGATTATT

90301 CCCTATAGAG GTCCACGTAG CTCCATTAAT AGGTAATGTC GTATCGCAAT

90351 CAGGATTGTG ACGATAGACA TCAGGAGCAT AGGCTACGAT TATAGTGTAA
90401 AAATCTGGTC GATTATGAGA ATTTTTACCA AAGCGGACGC CTACGGGAAC

90451 TGCAACGTTG AGTAGATGAC CGTGTCCAAA AATCCTCCCC TCGGGGGTAT

90501 TTTCTTGGAT GCCCCCATGA GTCGCGTAAG CAACTTCAGC TTTTACAAAG

90551 GGAATGATCT GCTTGAGGTT TAAGATGCGA GAAGAGAGAG TGATGGGAAG

90601 GTTCCCTTCG AGTTCTCCTA ACCAGCAATT GTTACTCCAA GAGCAGCGCC

90651 CTTTAGGCAA TTTTGTATAT GAAGTCTTTA CTTTCTCATT GCTACGGCTA

90701 TAGGTAACTC GAGAAGTGCC TCCTGAGAAG AATCTCGATG ATCCAAACAG

90751 AGACTTGGTG ATGTTAGAGT ATACTGTAGC GAAATAAACG TTAGAATGAC

90801 CGTGACCTAC GAGGTAATCC TTAGATTTTG TAAACAGCTG TCCAAAACCT

90851 AGACTTAACG CAGCATCTGG AGTGATGCGT GTGTAGGTAT TGATGAGGTA

90901 GCCTCCACCC ATATGGCGGT AGCTGCGTGC ATCACCGGTA TGATTCGCAT

90951 GGAAGAAATT TGTAATTCCT GTGATGCTCA GTTGCTTCCC AGGGACATCT

91001 TCGCCATCAG CTGCTGACGC TTGACTTACA GCTCGTAAAT CTATGACGTT

91051 TGCCCATAGG CTATTAGGAA TGAGGGGAGC AAGACGCTCC GGATGAGGAA

91101 GATAACCGGT TTTTTTCCAA TTTCCTGTGA CCGTATGGGT TGTCGTGTCT

91151 ATGGTAAACT CCCAAGTTCC TTGATACCCA TAGGGAGATT GCTGATAGCC

91201 GTTTGTGCCG AGACTGAAGT CCGTAGTGGT TACAGTATTT GAAGTCGCTT

91251 TGAGTTCTAA AATCGGAACT TGCTGTAAAT CTTTATTAAA CATCCCGTGG

91301 TTGTCACAGC AATCTTGAGA GTTTTTCACA AGTCCTAAAG TTCCGGATAT

91351 AGTGAGAGCT CCATTGGTAC TCTGCACATT AACGACAGCC GCTTTAGTGC

91401 CATCCAAAGA ATCCAGATTG ATTACAAGCT TGTTTAAGGT GATAGCACCG

91451 TCAGTATTAT TAGCTCCATT TGTAGTTGCT AATGTGGTCC CTGCATCCAT

91501 GATGACGACG GACTTTTCAT CTTGCGTGAA GTTATGAACA TTTAAGGTAG

91551 CACCGTTTCT TAAAGCGAGA GTACCGCCTT CAAGTTCTAG CTTTTGGTTT

91601 AATGTAGATG TAQCATTTGC AGGGGTTGCT GCTTCGGTAG CAGTGAGGGT

91651 TTCTCCTGAA AAGACAATAG TCCCTGAATA CGCACCATCT GCACTGGCTT

91701 TGGGATTGAC GACCACAGTA GCGGCTGCGG ATGCAGCAGA TAAATCATCA
91751 GATGTAATCG GATCATAGAA GTATAGGGTA TAGCCTTGCG TAGCTCCTAG

91801 AGTGGCAAAC TTGGCATCTT TTCCGAAGTG AATACTATTG CGAGTAGGTG 

91851 TCCCACTAGT GATGCTGAGG TTCTTGTTAA AGAGGATATC ACCTTGATTT 

91901 GCGGATAGAG AGAGCTCCCC AGATTCGGGA ATAGCAATAG CGCCTCCTTT 

91951 TCCAGCTGTA TTTCCAAGAA ATATTGTAGA TTTATTAGAA TCTATAGAGA 

92001 TCTTTTTGCC ATAGAGGGCT CCTCCTTGTT CGGAGGCAAT GTTTTCTAGG

92051 AATGTAACGG TGTTTTCTCC AGATATAGTC AGGCTAACAC CTGTTGGTGG

92101 GGGGGTAGCT GGAGGAGTAC AGAAGATGGC GCCTCCATAT CCTAACAAAG 

92151 GAGTGACTGC TGGTGGTGTA GGTGGAGGTG TAGGTGCAGG TAAGGAGTTT 

92201 TGTGGAGATG CTGTATTGTT TTGGAAAGTC AGGTCGCTGT TATTAGAAAA 

92251 TGTGACATTT CCGTTAGCAT AGATAGCGCC TCCTGAGCGC GAGCTATTAT 

92301 TAACGAACAA GACTCCTGAG AGGTTCCCAG AGGTGAGCAT AGATCCTCCG

92351 GTAAGGTAAA TAGCCCCACC ATAGATCCCT GTAGCATTCG TTGAGAAAAT

92401 CACAGGAGCG CTATTGTTGA TGAGGTTGAT CGCTGCAGAT CCCGTGAGGG
92451 CCCCTCCATT AGAGATGGAT CCATTACCAT TAAAGAGAAG GCTCTTTTTC
92501 GTATTTTCTA TTGTGATGCT TGTGCCTCGA ATGGCAGCTC CAAATCCTGC
92551 AGAACGGTTG TATTGGAATA GTATGGAGTC ATTGTTTGTA AAGAGCATGG 
92601 GCGTTGTAGC GTAAATCGCC GATGCGTGAG GTATGACATT ACTCGCTGAG 
92651 GTATCTGAAG TCAAAGATTC ACAGTTATCG AAGATCATCT GACTAAATCC
92701 TGAAAAACTC AAGGGACATA GTTCAGGATT TTGGGTGATT ACACTACTAA
92751 TCGCGGCTCC GTCAGCTGAA GAACGGATAT TTAAGAAGGA GAAAACCCCA 
92801 CCTTTTCCTA AGATTTGTAG TGCTCCCGCC CTATTGCTAA AGCAACTGGA 
92851 AGAGGTTCTG GATATGGCAT TATCAAGATT CGCAATGTAG AGATCCCCTG 
92901 AAAAAATACA GAGTGTCCCT CTAGGATCAG AAAGTGTTGT GTAAGGAAAA 

92951 ATCTTCCCAC TCGATCCATC AAAGTTCTCG GAAGGCATGA TAACTTCTAC

93001 AGTAAACGCT GTTGAAGCAA AACATGGCGC CAGTGTGGTA GAAATTAAGA
93051 ACTTACGAAT AGACGTTTTC ATTTGCACGT AGAGATGAAA CCAGATTATC
93101 CTACAAATAA GGGAAAGGCT GTAAAAAAAC AAGTACAATA AGACACAGTT

93151 TTAATCTCTT AATTTTGACA GCTTTAAGAT TACAGGATAT TTTAAAGGGC

93201 ATTTTCCCAT TTCTTACATT GCTTTCTTAG AAGAATACTT GATAGAAAAT

93251 GGCGATTCTA TTTTTGAAAA ATCTCAAGAA ATTCTCCCAA ACGAAGATGT

93301 TTTAGAGAAC CTGTAAAGTA GAAAATGGCG CTACGCCCGA GACTCGTGAA

93351 CATGTGTGCA TAGAGGGACC TATCTCTGAA TACAGATAGG TCCCAAAATC

93401 TTCTTAAAGG GGATTCCCTT AATTATAGTG TAGACTTAGA GATTATGGCG

93451 TAGAAGTGAT CTCAGGATTC CAAGTGAGGA AAACAGTTTT CTTAGTTGGT

93501 GTGATCCTTC TGTCTTTATC CAGAGAAGGG GGGTACTCTT CCCAGCTTAG

93551 GGACCAGAAA CCTCGAACGG CGTCATTCTC TGTTGTGACT TGGGTTCTCT

93601 CCAAAGTGGT CTCCCCTAGG AGAAGACTGT CAAAAGAAGG CCCTAGAAGT

93651 TCAAGAAGAG GAATCGTAAA GGCTTCCTTA AATTGAGGAA AGTCATAGAC

93701 GGACCAATAA GCCTCAAAGG CAACTTTAAG TTTTTCTAAT GAGACAAGAG

93751 CATCCTTGTC CTCGGCGCGA ATCCTTACAG GAACAAAGTT GTCGGTATCT

93801 TCAATCAGGA TGTGCAGATT CTGAACCCGA GCATCTCCTG AGCAGAGCAG

93851 AGTGGTTCCT GGAGACATAG TAAGCGTAGA GCTTGCTTCC TGCTTAAAAG

93901 AATGCAGTTG CAAGGTAACC CCATCCGATA GAGAGAGAGT TCCTCCTGCT

93951 AATGTGACAT CTTGTAGGAT TGTGGAAGTA AGATTTTCCG CACAAACTTC

94001 ATGATCATCC AGGCATAGTC CTGAGAAGCT AATTGTTCCT TCATAAGTTT

94051 CCTTTCCTTC AGGAGCATTG ATTACAAGAT CTGTAATTTT ATGCGACTCG

94101 CTATGGCTTA TAGGATCATA GAAATAAACT CCGGATTCTG AAACAGCACG

94151 TAGGTTCTTA AACTGTGCTC CAGATTGCAG ATGGATGGAG TTGTGTATTG

94201 TATTTCCGTC TTGTGATGCT GTATTTCCTT TGAAGATGAG ATCTCCGCTT

94251 1 CLTCCAGGA GCAATGGCAA TGGCTCCTCC

94301 ATTACTATTC ACGTCATGAT AAGCATGATT ATTTTCAAAA CACGAAGGTC

94351 CTCGAGTCGT GAGTGTGAGA TTGTGGGTAG ATATGGCGCC GCCATAACCT

94401 TGGCTCACAT TGTCTCTAAA CACCAGGTAG CGGTTCCCGC TGAGATTTAC



94451 TGAAGGACGA CTCGCCTTAG AACCTAAAAG GTAGGGAGTA TAAATCGCAG

94501 CTCCACTCCA CGAAGAGTAG TTCCCACAGA AAGTCACTTC CTCACTATTT

94551 TCGATCATCA CGGAACCAAG ACTATAAATC GCTCCTTGTC CTTGAGGTAG

94601 TAGAGGTGCT GAGGTGAACG CTAAGTAAGA AAAATTAGAG AGAGTGAGAG

94651 TGGTGTCTCC AACGCGGTTC GAAATGGCAG CGCCAAAACC CTCGGTCATA

94701 AGGTTGTGAA AAGTGAAGTT GCAACGGTTG CCCATGAAAA AAAGATTCCC

94751 AGATCGATTT ATAAAAACCC CAGCATCTTC TTGATCATGC TTAACGTTGG

94801 AAATCCTCAC GTCATCTAGA AAGATGTAAG AAGTTCCTTC TGGATAACAG

94851 GTAATTTTAG GTTCTAAGCT TTTATTATTG ATAGCACCGT TATAACCATC

94901 ACTTTCATGA AGATATACAA CTTGTGCTGC TGCAGGGAGA GCGAGGAATA

94951 AAGCCGAGCA GGTAAGAAAA TTTCGAAGTA TGGTCATGGT TTCCTCGTTA

95001 AATCAATAAG GTTGAAGCAA CTTTAATAAA CAAGAAAAAA AGAAGTCAAT

95051 AAGAATAGAT TATTGTCTAT TAATTATTTA ACTGTTTTTA AAATAAAATT

95101 ATAACTAGAA ATTATTAAAA GAAATCTTTT TTGAAGAGGG ACAAATGTTA

95151 TTTTTTACAG TTTGCAAGGA AAGCATTCCC TATAGCAAAT ATTTCCCTAA

95201 AAGTATGAGA AAACTCCCTA GAAGAACTAG GGAGTTTTAG CAATCTAGAA

95251 TCGGAGTTTG GTACCAACAT CTACATTGTA GTTCCTTGAA GATCCACGGA

95301 GTTCCATAGC GTAATGTCCG AAGAGCTCAC AATTGGAGTT GTAGACGTAG

95351 TTGTTGCTAC CCCTCAGTAA AAATGCCTGT CTTGAAAGAT TGCCACCGCG

95401 AATTTTCCAA GAGTCTGGGC TCATCACAAG AGTCGCTGTA GATTGGGGAT

95451 TGTTACGATA GACATCGGAA ACAAAGAATC CTGAGAGATC ATAGGTGTAG

95501 GAATCTCCGA TATCCCCCTG CACGAATTTC GCACCCACAG GAATCGAGAG

95551 GTTAAGCAGC CTTCCAATAC TAAAACCACG GCCATCACTA GAGCTTTCGA

95601 AGAAGCTATT TTGTGATACA TAAACCATTT CGACTTTCAT CTGTGGAATG

95651 AAGGTCTTGA AAAGAGGATG TGGGTTGGAA AGAACAAAAG GAAGGTCTAG

95701 GCCGATACCA CCAGCTATAC ACTCGTTGCT CCAAGAACCT TCGGATTCTG

95751 GCAATGAGGT ATAGTGCGTT TCCATACGGT TGTCTGAATG GCTGAACGAA
95801 ACTTGGACAT CCAAGGCTAG GGGAATTTCC CTAGGGAATT TTTCTATAGC

95851 TGATTCAGAA AACTTTGCTC TTCCTAATCT CAAATAGTTT TGGGGTTGTA 

95901 GGGTATGAGA GTGCTTGAAG AATAAAGTTC CACCGTAGGT TCTAGAGTTG 

95951 TTGTGAGCGA TAAAACAATC TTTGTCTCTA GCAAAGAGAT GGCAGAACGC 

96001 AAAGGTAAAT AGGTCGTCTT TAGGAGTGTG AGCACT^CCA CCGATGACGT 

96051 AGCCTCCAGA GGTATGACGG AAGCCTTTGC GATTTTCATC TCCAGTCTTA 

96101 TGCAGGAAGT TCGTCATGGA GGAAACCCAG AAACCTTGTT TGTGTTCCAT 

96151 ACCAGTTGCG CCGATCTCTA CAAGCTGTTG CAGAGAGCGA ATGTCAGTAA 

96201 AGACTCCCCA TAGGGTATTG CATACTAACG CAGATTTTCT TTCGGGGCTG 

96251 GGAACAAATC CTGTTTTGGT CCAAGTTGCC GTGGCCTCTT TTGTATTTGT 

96301 AGCTGTATCC GTAGTCCAAT TAACATTCCA TTGTCCTTGG AATCCGTATT 

96351 CTGAATTAGG ATCCTCAGCA GGAACAGGGA TAAGGCTGCT GATGTCAACG

96401 TTAGTATCAA CATCAGCATC AACCGTGATT TTTAATAGAG AGAAGAGCTG

96451 GTCATGGCTG AACATATGAC TTTCATAAAT GTTCCCTTCA ATATCAATCA 

96501 GGTTGAGCTT CCCAGATACG ATCACTTTAT TTGAAGCACC TTTTGCTGTT

96551 AGGCTGACGG GCTGCTTAAG ACCTAAGGAG TCAACATTGA TTCCTAGGTT 

96601 CGTGATTGTA ATACTCCCAG CTGTAGTTGA TAATGTCGTT CCTGAATCCA

96651 TGCCGAGGAG AGAACCGGCC TCTTGAGAGA AGCTCGTGCT CTCTAAAGTG 

96701 ACTCCCTTTT GTAGCAATAA CTTTCCTCCG GATAGGGAGA CTGGCTGCGT 

96751 GAATGAAGAT TTTAAATTGT CAGCAACTTT AAGTTCATCT GCTGTTAGGG 

96801 TTTCTCCAGA AAATAGAATC GTTCCTTGAT ATGGATTGAG AGCTCCCGCA 

96851 GAGCCGTTAT TTATCTTCAA TACGTCTGAT GAGGTTCCTT CTGAAGTGAT 

96901 GGGATCATAG AAGAAAATTG TATGATTTTT AGCAGCCCGT AATTCCGTGA 

96951 ATTTCCCGTT ACTTCCTATG TTGATCGCAT TACGTTTAGG AGTATCGGTA 

97001 CTTCCGGTTG TTGTAAGGGT ATTTCTTACA AAGGTAATGT TTCCTGTCTC

97051 TGCAGAAAGA CTGAGCTCTC CTGAGGCATC GATGCTGATA GCACCCCCCT

97101 TAGGAGTTGC TGATGAGACA TTATTTCGTA GAAACTCTGT AAAGCCTCCA 
97151 GAGGAAAGGG CTAGCTTTTT AGCATGGATG GCGCCACCGC TTGTTTCTGC

97201 TACGTTTGAA GCAAAGATCA GAGTCTTATT GTTAGAGATT ATCAGTTCAG 

97251 GAGATCCACT CGCCTTGGTG TTGCAGATCG CACCGCCAGT AGTTTTCGCT 

97301 GCATTCCCTT CAAAATATAG AAATTTGTTG TTCGATAGTA TCGACGTGCC

97351 TTCATCATCG ATAGCGCCTC CTGACGTAGA CGCTATGTTA GATAGGAATC

97401 TAACATAACC TGTGTTATTT GCTATGCGAG CGCCTGCTGT AGTAGCAATT
97451 GCTCCTCCCT TTGTTGATGA AGAGTTGTTA CTAAAAAGAG CATCTCCAGA 
97501 AGTGCCAGTT AAAAGGAAAG ACGCTCCTTT GATAGCTCCA CCATCTGCAG 
97551 TAGAAAAATT CCCAGCAACT ACAAGTTTAC GAATATTTTC TAAATTTACG 
97601 CCTCCTGCTG AGGAAAGCGT TCCCTGACCT GTAGTAACCG TTGTGCTAGG
97651 AGAGGAATCA AAACTCAGTA AGGAAAACCC TGAGAAGGTA AGATTCTTAT 
97701 TTGCTGTTGT AGATGCAGCA GCACCTGCAT GAGTGCCAGC ATCTATAAAG

97751 CCAAACGTTA AGCTATGACC GTTCCCCAAG AAGGTAAGAT TGTCCGTGGT

97801 TTGCTTAAAA CAACTGTCAG ATAAGGGAGT GCCTTTTCCA GGCTCGTAAA
97851 AGAAGACATC TCCTGTTAGA GAATATGTTG TGGCTGAAGT TTTTGGAGTA 

97901 AACGTTCCTG AATCGATATT TCCATTAAAG CTATCATCAG GTGATAAAAG
97951 TTCCTCGTTA GCTAGTGACT GTAGGTGACA TGAGAAAGCT AACACGGAGG 
98001 AAACTAAAAC CCAAGGAATC GAAGTCTTCA TGGTAATGCT TTTGTTTTTT 
98051 AGAGAACTAT TCGCATCAAT ATAGAAACAA AATAAGTAAA TCAAGTTAAA
98101 GATGACAAAA CAGCTGTCAA GAATTTTTAT CTTGACTCTC TGAGTTTTCT
98151 ATTTTATATG ACGCAAGTAA GAATTTAATA ATAAAGTGGG TTTATGAAAT 
98201 CGCAATTTTC CTGGTTAGTG CTCTCTTCGA CATTGGCATG TTTTACTAGT 
98251 TGTTCCACTG TTTTTGCTGC AACTGCTGAA AATATAGGCC CCTCTGATAG 
98301 CTTTGACGGA AGTACTAACA CAGGCACCTA TACTCCTAAA AATACGACTA 
98351 CTGGAATAGA CTATACTCTG ACAGGAGATA TAACTCTGCA AAACCTTGGG 
98401 GATTCGGCAG CTTTAACGAA GGGTTGTTTT TCTGACACTA CGGAATCTTT
98451 AAGCTTTGCC GGTAAGGGGT ACTCACTTTC TTTTTTAAAT ATTAAGTCTA 
98501 GTGCTGAAGG CGCAGCACTT TCTGTTACAA CTGATAAAAA TCTGTCGCTA

98551 ACAGGATTTT CGAGTCTTAC TTTCTTAGCG GCCCCATCAT CGGTAATCAC 

98601 AACCCCCTCA GGAAAAGGTG CAGTTAAATG TGGAGGGGAT CTTACATTTG 

98651 ATAACAATGG AACTATTTTA TTTAAACAAG ATTACTGTGA GGAAAATGGC 

98701 GGAGCCATTT CTACCAAGAA TCTTTCTTTG AAAAACAGCA CGGGATCGAT 

98751 TTCTTTTGAA GGGAATAAAT CGAGCGCAAC AGGGAAAAAA GGTGGGGCTA 

98801 TTTGTGCTAC TGGTACTGTA GATATTACAA ATAATACGGC TCCTACCCTC 

98851 TTCTCGAACA ATATTGCTGA AGCTGCAGGT GGAGCTATAA ATAGCACAGG 

98901 AAACTGTACA ATTACAGGGA ATACGTCTCT TGTATTTTCT GAAAATAGTG

98951 TGACAGCGAC CGCAGGAAAT GGAGGAGCTC TTTCTGGAGA TGCCGATGTT 

99001 ACCATATCTG GGAATCAGAG TGTAACTTTC TCAGGAAACC AAGCTGTAGC 

99051 TAATGGCGGA GCCATTTATG CTAAGAAGCT TACACTGGCT TCCGGGGGGG

99101 GGGGGGTATC TCCTTTTCTA ACAATATAGT CCAAGGTACC ACTGCAGGTA 

99151 ATGGTGGAGC CATTTCTATA CTGGCAGCTG GAGAGTGTAG TCTTTCAGCA 

99201 GAAGCAGGGG ACATTACCTT CAATGGGAAT GCCATTGTTG CAACTACACC 

99251 ACAAACTACA AAAAGAAATT CTATTGACAT AGGATCTACT GCAAAGATCA 

99301 CGAATTTACG TGCAATATCT GGGCATAGCA TCTTTTTCTA CGATCCGATT 

99351 ACTGCTAATA CGGCTGCGGA TTCTACAGAT ACTTTAAATC TCAATAAGGC

99401 TGATGCAGGT AATAGTACAG ATTATAGTGG GTCGATTGTT TTTTCTGGTG 

99451 AAAAGCTCTC TGAAGATGAA GCAAAAGTTG CAGACAACCT CACTTCTACG 

99501 CTGAAGCAGC CTGTAACTCT AACTGCAGGA AATTTAGTAC TTAAACGTGG 

99551 TGTCACTCTC GATACGAAAG GCTTTACTCA GACCGCGGGT TCCTCTGTTA 

99601 TTATGGATGC GGGCACAACG TTAAAAGCAA GTACAGAGGA GGTCACTTTA 

99651 ACAGGTCTTT CCATTCCTGT AGACTCTTTA GGCGAGGGTA AGAAAGTTGT

99701 AATTGCTGCT TQTGCAGCAA GTAAAAATGT AGCCCTTAGT GGTCCGATTC

99751 TTCTTTTGGA TAACCAAGGG AATGCTTATG AAAATCACGA CTTAGGAAAA 

99801 ACTCAAGACT TTTCATTTGT GCAGCTCTCT GCTCTGGGTA CTGCAACAAC
99851 TACAGATGTT CCAGCGGTTC CTACAGTAGC AACTCCTACG CACTATGGGT

99901 ATCAAGGTAC TTGGGGAATG ACTTGGGTTG ATGATACCGC AAGCACTCCA 

99951 AAGACTAAGA CAGCGACATT AGCTTGGACC AATACAGGCT ACCTTCCGAA 

100001 TCCTGAGCGT CAAGGACCTT TAGTTCCTAA TAGCCTTTGG GGATCTTTTT 

100051 CAGACATCCA AGCGATTCAA GGTGTCATAG AGAGAAGTGC TTTGACTCTT 

100101 TGTTCAGATC GAGGCTTCTG GGCTGCGGGA GTCGCCAATT TCTTAGATAA 

100151 AGATAAGAAA GGGGAAAAAC GCAAATACCG TCATAAATCT GGTGGATATG 

100201 CTATCGGAGG TGCAGCGCAA ACTTGTTCTG AAAACTTAAT TAGCTTTGCC 

100251 TTTTGCCAAC TCTTTGGTAG CGATAAAGAT TTCTTAGTCG CTAAAAATCA 

100301 TACTGATACC TATGCAGGAG CCTTCTATAT CCAACACATT ACAGAATGTA 

100351 GTGGGTTCAT AGGTTGTCTC TTAGATAAAC TTCCTGGCTC TTGGAGTCAT 

100401 AAACCCCTCG TTTTAGAAGG GCAGCTCGCT TATAGCCACG TCAGTAATGA

100451 TCTGAAGACA AAGTATACTG CGTATCCTGA GGTGAAAGGT TCTTGGGGGA 

100501 ATAATGCTTT TAACATGATG TTGGGAGCTT CTTCTCATTC TTATCCTGAA 

100551 TACCTGCATT GTTTTGATAC CTATGCTCCA TACATCAAAC TGAATCTGAC 

100601 CTATATACGT CAGGACAGCT TCTCGGAGAA AGGTACAGAA GGAAGATCTT 

100651 TTGATGACAG CAACCTCTTC AATTTATCTT TGCCTATAGG GGTGAAGTTT 

100701 GAGAAGTTCT CTGATTGTAA TGACTTTTCT TATGATCTGA CTTTATCCTA 

100751 TGTTCCTGAT CTTATCCGCA ATGATCCCAA ATGCACTACA GCACTTGTAA 

100801 TCAGCGGAGC CTCTTGGGAA ACTTATGCCA ATAACTTAGC ACGACAGGCC 

100851 TTGCAAGTGC GTGCAGGCAG TCACTACGCC TTCTCTCCTA TGTTTGAAGT 

100901 GCTCGGCCAG TTTGTCTTTG AAGTTCGTGG ATCCTCACGG ATTTATAATG 

100951 TAGATCTTGG GGGTAAGTTC CAATTCTAGG AGCGTCTCTC ATGTCTCAGA

101001 AATTCTGAGA GAGATCGCAT TTAGGATTTT CTTAAACACG ACTCACCTTG 

101051 TTTTTGAACC AGGAGAGATC GGGGATTAAA AAGGCAAGAG GGCAGAGTTC 

101101 GTGAGGTCAC GTACTCTGCC TTTCTTGTTA CAAACACGTT TTAAAATTAA 

101151 GGAAATTTTT TAATAGAAAC CCGTTCTTTA AAATACGTTT CTTTAATTCT 
101201 TATTGAATAA GATAATTCAC TATTTTTAGA TCCTAAATTT TAAGTGGTTT
101251 TTGTTATGCT TCTTATAGAG AATAGCTGCA AAGATTAGAG TTGCAGAGAC 

101301 GGTACGTCTC TTTCTTTTTT AAGGGAAGGG GTGTTGTTAC ACCCATCCTA
101351 AGATTTGTGA GATTCCCCTC AGGCAGTAAC TTTTACAATC GTACTTTATG 

101401 TTTTGATCTA GCTGTTTTCT TGTCTTTAAT TTATTCAACC ATCGAGAAGA
101451 GAGATCCATG AGTGGAAATG TATTTTATTA GGATCATCTC TAAGGATGGA 
101501 AATGATGAGC CCATTCCAAC AACCTGAGCA ATGTCATTTT GATGTTGTGG 
101551 GAAGTTTCTT ACGTCCTGAA AGTCTTACAC GAGCACGCTC TGATTTTGAA 
101601 GAAGGAAGAA TTGTCTATGA GCAGATGCGA GTTGTCGAAG ATGCTGCTAT 
101651 TCGTAATCTC ATAAAAAAGC AAACAGAAGC AGGTCTTATC TTTTTTACTG 
101701 ATGGGGAATT CCGTAGGTAT AGTTGGGATT TCGACTTTAT GTGGGGATTC

101751 CATGGCGTGG ATCGTCGCAG GGACTCTAAT GACCCTGAAA TTGGAGTGTA

101801 TCTTAAAGAT AAAATCTCCG TATCAAAACA TCCGTTTATA GAACATTTCG
101851 AGTTTGTCAA AACTTTTGAG AAGGGAAATG CAAAAGCAAA ACAAACGATT 

101901 CCTTCTCCAT CACAATTTTT CCATGAGATG ATTTTTGCTC CTAATCTGAA
101951 AAATACTCGG AAGTTTTATC CTACGAATCA AGAGCTAATT GATGATATTG 
102001 TCTTTTATTA TCGCCAAGTC ATCCAAGATC TTTATGCTGC AGGTTGTCGT
102051 AATTTGCAGT TGGACGATTG TGCTTGGTGT CGCCTCTTGG ATATACGAGC
102101 GCCTTCTTGG TATGGTGTTG ATTCTCATGA CAGGTTGCAG GAAATTTTAG 
102151 AACAGTTTTT ATGGATCCAT AATTTAGTGA TGAAGGATAG ACCCGAGGAT 
102201 CTTTTTGTAA GTCTGCATGT CTGTCGTGGT GATTATCAGG CCGAGTTTTT

102251 CTCTAGACGA GCTTATGATT CTATAGAGGA GCCTTTATTT GCTAAGACCG

102301 ATGTGGATAG TTATCACTAT TATTGGGCTC TTGATGATAA GTATTCAGGA
102351 GGTGCTGAGC CTTTAGCTTA CGTCTCTGGA GAGAAACACG TCTGCTTGGG
102401 ATTGATCTCC AGCAACCATT CTTGTATTGA AGATCGAGAT GCTGTGGTTT 
102451 CTCGTATTTA TGAAGCTGCG AGCTACATTC CCTTAGAGAG ACTTTCTTTG 
102501 AGCCCGCAAT GTGGGTTTGC TTCTTGTGAG GGAGACCATA GAATGACTGA
102551 AGAAGAACAG TGGAAGAAGA TCGCGTTTGT GAAAGAGATT GCTAAAGAGA

102601 TCTGGGGATA AAGAATCCGG AGTTTTTATC GACTCTAAGA GTTTTCGGAT
102651 CATAGAAAAC ATTTAAATAT TCAAGAGTCT TTGGCTATTG GATCATAGAC 
102701 AGTCTTAGTA TACTAAAAAG TCTTTGGATT CTAAGACGGG CAGAGTTCGT 
102751 GAGATCACGT ACTCTGCCCA TTCTTTCTTG TGATCTAGCG ACTTCTTTGA 
102801 ATCTTCGACC TCTTGTAATC TGGGATTTTT TCTAGTTCTT AGATTCCTCT 
102851 GATCTTTCGA CTTCTCCTCG TCTAAACAAG GCGCATTGTC TTTGAGAAGT 
102901 CCCTAGATAC ACTCAGGATC TCTTAGAATT TCTAAGGGAT CAGGAACGCT 
102951 TTTAGAACTG GAACTTACCT CCAAGATCTG CATTGTAGCT GCGTGAAGAT 

103001 CCACGAATTT CCATAGATAG GTTACTTGTG ACCTCAAGAT TTGGAGAGAA
103051 GGCATAAAAG ATCCCTGCTC TTCCGATACC AGCTTGTCTT GAGAGATTCG
103101 TTCCTGTAGT TTTCCACGAG GTATTGTTGA TTAGGAGAGC TGTCGTGCAG 
103151 TCAGGATTCT TACGATAGAC ATCGGCAACG TAGATGACAG TAGCTTCGTA 
103201 AGACGCACGC TCGTTTCTCG AGAATCTCTC GAAGGTAATT CCAATAGGCA 

103251 CAGAGACGTT AATTAAATCA CCGCTATCGA AAGATCGTAC CAAGGTAGTA

103301 TTACGTTCTT TGAAGCTATC TTGGTGTATG TACGAAGCTT CTACTTTGAT
103351 GAAAGGAAAA TACGCGTGGA AGAGACCCTC ATGGCTTAAA GCAGTGTGTG
103401 GTAGGGAGCT CGCAAGTTCC AGAGCGCAAC CGTCATTATA CCACGAGCTC 
103451 TCTCCCTTTG GTGCTTGGGT GTAATAGGTT TTCATAGTAT TTTTACTATA 
103501 GATATAGCTG ATCTGAGCAT CAAAGAGGAC AGGCTGCTCA CTTTCAGATC
103551 CAGGAAGGTA GCGTAACAAG CTTGGAGAAG ACAAGGTCGC TAGATGCTGG
103601 AGATGGAGAG AAGCTGCATA GGCAGAAGCT CTATTTTTAT TTATAAAGTG
103651 ATCTCTATCT TTCCCGAATA ATTGGCAGAA GGCTGCAGTG ATAAGATTAT
103701 CAGAAGCTAA TGTTGTAGTC GCTCCTACAA CATAACCTGC ACTTATGTGG
103751 CGAAAACCTT TATTTATCTT CGTGCTATCT TTATGGAAGA AGTTCGAGAT
103801 CCCTTCACAC CAGATGCCGC GAGTTTCTTG AGATTGGCGT ACTTTAGTGG
103851 CTACAAGCTG TTGTATGGAG CGCACATCAA CAAAGGATCC CCATAGCGTG




103901 TTAGCAACTA AGGTTCCACG ACGCTCAGGA TTCGGATTGT ATCCTGTTTT 

103951 TGTCCAGGTA AGAGTCGCTG CTTTGGATTT AGTCGCAGTA TCCTCTTGCC 

104001 AAGATAATGC CCAATTCCCT TGGTATCCCC AATGGATAGG ATTTTTTTCT 

104051 AGGGGATCAG CAGCTAAGTC TGTGATGTGA ATATTCGCGG GGTCGTCAGC

104101 AGTAAGAGTG AGACAAGAAA AGACTTGAGG GTTATTCCAA GAGACATCTT 

104151 CGTAGACATT TCCAGAAGGA TCTACAAGAG AGAGCGATCC AGATAAAGTG 

104201 ACTGTCTGAC TTGCTTGTGT TGCTTTTAGC GTAGCCTTCT TGGTCTCTTT
104251 TAAGGAATCT ACATTGAGAA CAAGATTATT GATAGTGATC CCATCAGCGG

104301 TTTCTAATGT GGTCCCTGCA TCCATGAGGA GGGTAGAGCC CGGAGATTGC

104351 GAAAAGGACT TAGCAACTAG AGTGACTCCT GATTTAAGAG AGAGTTGCCC

104401 TCCCGCAAGA GTTAGAGGTT GCTGAATTGT AGATTTGAGA TTATCAGCTT
104451 CTGCAGCTTC TGCTTCCGAG AGCTTCTCTC CAGAAAATAC GATGGTTCCT
104501 TGATATGCAG GATTCCCTGC AAGGTCAGGA CCATTTAAGT TTAGAGCATC 
104551 TGAGAGAGCT GCAGTGATGC TAGTTGTTAT AGGATCATAG AAGTAGATAG 
104601 TATTGCCTTG AGAGGCTCGC AGCTGTACAA TCTTAGCATT GGTGTTTCCG
104651 ATGTTAATAG AATTTCTGGT AGTGGTCTGA CTCGAAGAAG CTCCTTTGAC
104701 TACTGTGTTT CCTTCAAAAG TGATGTCTCC ACCAAGAGCC GAAAGACTCA 
104751 AAGATCCAGA GTCAGCAATC GCAATTGCTC CTCCTAAGGG AGCTGCAGTA 
104801 TCTATAGCAG AGTTGTTTTT AAAAAGCGTA GGTCCTCCAG AAGAAAGAAC
104851 TAGATTGTCA GTATAAATCG CCCCACCACT AGTAATTGCT GTATTTCCTA 
104901 TAAAGTTCAG TTCCCCGTTG TCTGATAGAG TTAAGACTGG TTTGGGGGCT 
104951 GATGTACTAC TACAGTAAAT GGCTCCCCCT GTAGCTGAGG TTGCGGTCAC 
105001 ACTATTGTTT ATAAAGCTAA TTGCTTTGTT GCTGCTAATA AAACTGCTAG 
105051 CTTCCGTGTA AATGGCTCCG CCATTGTTCG CCGCGGTATT TTCAGAAAAT 
105101 GATGCTGAGT TTAACGTATT GTTAATTGTA ATCCCTCCCG TGGAATAGAG 
105151 GGCACCCCCT TTTTGCGTTG CTTTGTTTTT GGCAAACGTT AGGTTGGGGT 
105201 TTAGCGATAG ACTGATAGAG CTGCCTTGGA GGGCGCCTCC ATTGTCATTA 
105251 GAAAAGTTTT GGCCAAAGTA GCAACTATAG TTCGACTGAA TAGAACAAGC

105301 TCCTGTGGAC TTGATGGCTC CTGTTCCTGT GGTAGCATTC GTGGTTTGTA 

105351 TTAGTGACAA ATAGGAGAAT CCTGAAAAGG AGAGAAGCTT ATTTGCAGCT 

105401 GTATTGGTAA AGGTACAGTT CGCTCCCGCA TCGATATTTT GTAGGAGAAA

105451 TTGGTAGCCG TGGCCTTGGA AAGAAAGATT CCCAGTAGTT TCTTTAAAGC 

105501 AGGAAGCGGT TAGAGCTGTC GGAGATCCTG CATTGGTGAT TGAGACATCC 

105551 CCTGTTAGAT TATAGATAGT TCCATCTGCA TTTGTTGTTT GGGCTGGAGG 

105601 AGTGTAGGTT CCTGGTCCAG AGAAGCTATT GGTAGGTCCT AGATTGATTT 

105651 CAACAACAGC AGCAAACGCA GAGAAATTTA GTGACAAGGG AAGTGCTAAA 

105701 GATGACGAGA TTAAAAACCA ATGAAGAGAG GATTTCATGT AGAGGGCTAT

105751 AGGTGGTTTA ACAAATTATT TCACCACATA CTGCAATAAA TTAAAGAAAG 

105801 CAAGAGGAAA GGAGAGACTA GTAAGTTAAG AATCTACAGG GTTTTTATAA 

105851 GAATTCCTCC CTAAAAGTTT AGGGAGGAAA GTAGGAACTA GAATGAGTAT 

105901 CTTAGCCCAC AATCTACATT GTAGATGTGT GCTGAGCCAC GAAGCTCATA 

105951 AGCAGCTTCC CCAGAGAGTT CTACATGAGG GGAGAGAGTC AGATGGCTTC 

106001 CAGCACTTGC TAAGAAGGCT TGTCGTGCGA GGTTTTTACA TAGCGAAGTC 

106051 CAAGAGGCTC CACTGACCAT TAGAGAAGTA CGCGAACGGG GATTTTTACG 

106101 ATACACATCA CCAATGTAGG CTAGAGAAAT CTCGAAATTA TTTTTTTCAT 

106151 CTTCGGAGAT TTTTTCTAAC CGAATGCCGA CAGGGATAGA GCAGTTCACT

106201 AGGTCTCCAT CATCAAAAGC ACGGGCTTCA GCGCCACTCT CTTTAAAGTT 

106251 TTGTTGGCGG CTGTAGACTG CCTGGAACTT TAAGAAGGGG AAATATCCCT

106301 GGAAGAACGG TGCTTCTTTA GGGAGATATA GAGCCAGAGA TCCTCCGAGC 

106351 TCTAGAGCCC CAGAGTTATT GGTCCAAGAG CCTTGAGCTT CAGGATAGGA 

106401 AGTATAGCGA GTATCCATAT CATTTTTAGT GTAGCTGTAG CTTAGCTGGG 

106451 CATTCAAAAT GAGAGGAATA TCTTTCAGCA TGTCGGTGAT ACTTCCAAAT 

106501 GAGGGCATGG GAAGTCCTCC TAGGAATGCT CGATGTTGCA GGTATAGCGA 

106551 CGCTAAATAG TTATGAGAGG TATTTTCAAC TATAAACAGG TCTTTATCTT 
106601 TACCGAAGAG CTGGCAGAAA GCTACACTGA AGATATTTTC AGAAAAATCT

106651 TCAGCACTTC CTCCAACAAT ATAGCCGTAG CTTTTATGTC GGAATGCTTG 

106701 GTTAGTTCCT GATTTATCCT TATGGAAGAA ATTCGCAGTT CCTGATGCCC 

106751 AGAGTCCTCG TTGCTGATAG ATACTATTCG CTTGAGATGT CATGATCTGC 

106801 TGTAGAGTGC GAATGTCAGT AAAGGATGCC CATAATGAAT CGGGAACTAC 

106851 GGAAGCTCTA CGCTCAGGAT TAGGGTTGTA GCCCGTAGTT ACCCAAGTCA

106901 TAGTTCCTGA TTTTGCAGTT GATGTGTCTG CCCAAGTGGC TTCCCAATGT 

106951 CCCTGATACC CGTAATGAGG TTCTGGAGTT TGTACTGGAG AAGTGAGAAG 

107001 CGCATCGATA TAAATATCGC TAGCAGCAGT AGCAGCAGTG AATACCACCA

107051 AAGGCTGCGT GAAGGCTTGG TTTATCGTAT GGCTTTCATA AAAATTGCCG

107101 CTACTATCTT GGAAAACAAG AGGAGAGGTT AGAGTTATAG TTTTGTTGGC 

107151 TCCTGCTGTT TCAATGGACA CACTCTTATT TCCCTCTAAG GCAGAAAGAT 

107201 CAACGACAAG TTTGGTAAGA CTGATAGCTT CAGTATCTGC TTTGAGCTTT
107251 GTTCCTGGTT GCATGAGGAG TGTAGAGCCT TCAGTCTGTG TGAAACCATT 

107301 GACATCTAAC TCGACATTTC CTTTGAGTGC TAAGGTTCCA GAGGCTAGAG

107351 CCAATGGTTG CTTTAATATA GATGTGAAGT TATCAGCAGC TTTCGCTTCA
107401 TCTGCAGAGA GCTTTTCCCC AGAAAATACA ATCGTTCCTG AATAATCTAA 

107451 AGGCGAGTTG CTATCCGGTT GGTTGATGGT CAGAACGTCT GAAGCTCCTG
107501 TGGTGTTAGA TGCAATCGGA TCATAGAAAT AGATAGATTG GCCTTGGGCT 
107551 GCCCTTAAGT TCGTAATTTT TGCTGACGAT CCCAGGTAGA TAGCATTCCG 
107601 TGTCGATGTT GGCGCGGAGG TTGAGGTTAG AGTGTTGCCA AGGAACGTGA
107651 TGTCTCCTTG ATTTGCAGAG AGACTTAAAG ATCCAGAGTC GGCAATTGCA
107701 ATAGCGCCGC CCTTGCCTGC AGCTGTGTTC CCGCATCTAT TATTTGAAAA
107751 TAGGGTAGGG CCAGCAGCGG AAAGATCTAG ACCATGGGCA CAGATTGCTC
107801 CGCCTTGAGT TACTGAAGAG TTCTCGGCGA AGGTCAGACT TTTATTTCCA 
107851 GAGATAGTAA GAGTAGGAGT CTCTCCTGTT TTTTCACAAT AAATGGCCCC
107901 GCCCTTGCCT GCAGCATCTG TTGCAGTGTT TCCAGAGAAG AAAAGGGAGC
107951 TATTTTGAGT AATCGAGGAG CTGGCTTCAA AGCCCAGAGC CCCACCCCCA

108001 GTTTCTCCTT TATTATTCAT AAAGACTAAC TGGCCGGTGT TTCCTGAAAT 

108051 ACTTGCAGCC GCAGAGCTAT AGATCGCTCC ACCTAATTTT TTTGCGCTAT 

108101 TACTAGTGAA GGTTATAGAA GAGGTATTCC CAGAAATAGA AAGAGTTTTT 

108151 GTGGTGATCG CTCCGCCATT GTTATTAGCT TCATTGGAGA CGTTTTGGCT 

108201 AAAGAGAATC GTTCCATTAT CGGTAAGATT TAAGGCTCCT GCAGAACTTA 

108251 AAGTACTTTT TCCTGAAGCA ACTGTAGTTC CAGGAGCTGC AATGAAGGAA 

108301 AGGTTAGAAA ATCCTGTGAA TGTTAGGGCT TTATCAGCAG TTGTGCTTGC 

108351 CGCAGCTCCT GCATTCGAAC CCGCATCTAC CGTGTTGAAT GAAAATGAGT 

108401 ATCCCTTTCC AGTAAATGTC AGATCACCCG TAGTTTCTGT AAAGCAGCAG

108451 CCTGTTAATG CTGTGCCTTT CCCAGCATCG TTTATATAGA CATTTCCTGA 

108501 TAAGACATAG TTCGTTCCAT TGGCATCTGC TGTAGATTTT GGAGTAAATG

108551 TAGAGCCGCC CGCTCCATCA AAGCTATCTG TAGGGGATAA AGAAGCATCT 

108601 GCTCCGTAAG TTGCAATGCT CAATAGAATG GGAGTGACAA GAGTCGAAGA 

108651 GATCAGGAGT TTGTGCAAGG GTATTTTCAT AGAAAGATGC TTGGGTTCAA 

108701 TTAATTAACA CGTTTTCGAT AATCTAGAAA CAAAACTTAG AGCCTAGGTT 

108751 TGTATTATAA TTTCGTGAAG AACTTCGTAC TTCAAAAGCG AATTGACCGA 

108801 AGATTTCCAT GTGGGGGTTC ACTTGGAAAT GGTTCGCAGC ACGAACAGAA

108851 AAACCTTGTC GTGCGAGGTT GGTACCATAG GCCATCCAGT TAGCATCGCT 

108901 AGCTATTAGG GAAGTTTGAC ATTTAGGATT GCGTCGGTAA GCATCGAGTA 

108951 TATACATAAG AGTAAGATCG TAAGTTCCCT TTTCTGATTT TGAGTCTCTT 

109001 TCGAAGGTGA CGCCTATAGG AATCTCTACG TTGATAAGCT CGCTTTTATT 

109051 GAAAGCGCGT CCTTCAGCAT GACGCTCGTA GAAGTCTTGC TGATGCGCAT

109101 AGATATACTG TACTTTGACA AAAGGTTCGA CTTCTTTCAG AAGATACGGA 

109151 ACGGAAATAA CAAAAGGCAG GCTAGCTCCA AGATCTGCAC AGAAGGCATC 

109201 GTTTCTCCAA GAACCCTTGA TGATAGAGTT ATCGGTATAA TATGTCTTCA

109251 TGTGGTTGTC TGTATGGAGA TAACTGAATT TAGCATCGAA CGATAAAGGA
109301 ATGATCTGGG AGATCTCAGA GAGCACCCAG GGAGCTCGGG TTGCTTTTCC

109351 CCAGAGGAAA TTGGCGATGT CGAAGAGCCC TTCTGTATGG TGGAAATACA

109401 AAGAGGCACC GTAAGTATCT CCGTGGTTCT TACCTGTAAT ATGATTGCGA

109451 TCTCTAGCAA AGAGCTGGCA GAAGGCAAAA GTAAGCTGAT CCTCGGCAGG

109501 AGTTGTTGCT GTGATCCCTA GTGCATAACC CCCGCTGATA TGGCGGAAAC

109551 CATGGCGGGT GGGCATAGAA TCTCTATAGA AGAAATTCGC AATTCCTGAA

109601 AGCCATAGCT CACGCTCAAA AGGCTCCCCA CTGGACTTGG TTTCTATAAG

109651 CTGATTGATC GAGCGTATAT CTATAAAGTT TCCCCATAAG CTATTTAGAG

109701 GGAGATTACT TTTTCTCTCA GGACTAGGAA TGTATCCTGT ACGGGTCCAG

109751 TTGATGCTTC CTATTTTTGA GGATGTTGCA TTTGCCCAAG ACAACTGCCA

109801 GTTTCCTTGA TACCCGTAGT GGGTTTCAGG TTCTTGAAGA GTCAGGGTAG

109851 AAAGAGCTCC CAGAGTAATC GTTCCGTTGG CTCCTGCGGT GGTAAGTTGA

109901 AGAAGAGGAT AGGTACTAGC ACTTTTTAAG TTATGATTCT CATAGAATGA

109951 CCCTTCCGTG TCAATAAGCG CAATCGTTCC CGATAGGCTG ATATTTTTAT

110001 CTGCAGCTTC TGTTTTTAAA GCTGCCTTGT TGGTTCCATC TAAAGAGGAG

110051 AGATTTACTG CTAAGCCATT AAGCGAAAGA TTTGCCTCTT TAGCACTAAG

110101 TGTAGTCCCC CCATCCATTA AGATGCGGGA TCCTGGACTT TGAGTGAGAT

110151 CCTTGAAAGT TACGGTGACT CCATCACGAA GTACAAGATC TCCCCGCGCT

110201 AATACTGCAG GTTGTCGGAT AGTAGAGGTG ACGTTTGCAG CGATTGCTTT

110251 TTCTGTAGGG GAAAGCTTTT CTCCAGAAAA GACAATCGCA CCCCCATACT

110301 CGATCTCACT GTTCGCATCT GCTAAGTTTA AGTTCAATGT GTCGGTAGAA

110351 GCTGCGGTTC CTGGATTTGT GATGGGATCA TAGAAATAGA TAGATTGCCP

110401 CGTAGCAGCT CGTATCGATG TGACTTTAGC GGTATCAATG ATATTTATTG

110451 CGTTTCTTGT ACTTGTGCTT CCGTTGGTGA CTTGGTTGTT ATTGAAGGTA

110501 ATATCTCCAG AAGTAGCAGA GAGAGCGAGT TCCCCAGCAG ATGCTATATT

110551 GATCGCTCCT CCTCCTCCCT GACCGGCGCT ACTTCCTGAG ATATTACTTT

110601 GAAATAGAGT AGGACCTCCA GCGGAAATAC TGACCTTGAG TCCAGAGATG



110651 GCTCCGCCAT ATGTCAATGC TGTATTATTT GTGAAAGAGA GGTTTTTGTT 

110701 CCCAGTAAGA GTCACTGTTT TATCTGTCGT AGTGCAACAA ATAGCCCCGC 

110751 CCTGAGCTTG AGCGGCTTCC CAAGCACTAT TGCCGTCAAA GATCACTTGA

110801 AAGTTATCTG TAATCGAACA GTTGTCAGTG CTGTACAGAG CACCGCCAGA 

110851 TCCTTTCGCT AGGTTTTGAG AGAAGGAAAC TATCCCAGGG CTGTTCTCGA 

110901 TAGTTATAGT TCCTGTAGCG TAAACTACAC CGCCTTGCTT CCCTGTGAAG

110951 GCTTGGTTTC TCGAAAAGCT CGCAAACTGA GATGTCCCTG ATAATAAGAA

111001 GTTTTTCGTA TTGATAACAC CGCCGTTATC TGACGAGAAG TTCTGAGTAA 

111051 ATATAATTTG GGAATTGCCA GTTAGAGATA GATTCCCCAC AGATTTTAAA 

111101 GCACATTGTC CAGTAGGAGA GAGAAGAAGA GAGGGACAAG AGATAATAGA 

111151 GAGTCTAGAA AAATCATTAA AGAGAAGATT CTTATCTGCT GCTGAGGTAC 

111201 TGGCTACAGT TCCAGCGCTA GAGCCCGCAT TGATAAATGC AAACTTCAGT 

111251 GCATGTTGAT TTCCTTGGAA AGTAAGATCG CCGCCCGCTT CTAGGAAGCA 

111301 TCCTGAGGCT AAGGGAATTC CTAAAGCCCC TGCATTTTGA AAGGATACGT
111351 CGGAAAGTAA GGAATAGGTA GTTCCTGCAG CAGCGTCCGT AGTGGAAAAG 

111401 ACCGTGAAGG TAGTTCCGTT AGATCCATCA TAGCTATTAT TGCTGCTATC
111451 TAAGGTCACC TCTGCCGCGA CTATAGAGAG CGATGAAAAG AGCGGGATTG
111501 AAGAAAAGAA CAACCAAGAG ACAGAGGACT TCATTTGTAA GCACTTTTTT 
111551 GAAACAAGGA AATTAAATTA GCAAATACTG TAAAGAAAAA AAGAAATCAA 
111601 GGGAAACGCA AGGAATTGAT TGATGCGGAG AATCAGAACC CCAAGGATGG 
111651 CGGATCTTTT ACTTCTCTTC ATACGGATCC TAAGAATCTC TTTGATGAAG 
111701 AGGGGATGCC CTCCCCCTCT GATACCCTAC AGTGCGATCT CAATAACGTA
111751 TTCATCTTTA TAAAAAGTAT GTTTTTCTAA GATTCTCGGA GAATCTTAGA
111801 AAGAATAACG AGTTCCACAG TTTGCATTAT AGCTTCTTGA GGAGCTGCGC 
111851 AGTTCACAAC TTCCAGAAGC GAAGCAGTCA AGACCATGAA GTAACTTCAG 
111901 ATGTCCAGAA GCCTCAGCAA AGAAAGCTTG TCGTGATAAG TTTGTAGCAA 
111951 ACGTAGACCA CGAGGTGCCA TTTGTTAAGG AGGTCAGGCA GTGAGGGTGA
112001 TCCCGGTAAG CATCTACAGC GTAACCTAAA GTAAGAAGCA AAGCACTGGG
112051 GGGCTTTGCT GATTCGTGTT TGAAGGTGAG TCCCATAGGG ATAGACACGT
112101 TGACCAGATG GCTAGCGTCA AAGATACGTG GATCAGCAGC AACCTCTTGG
112151 AATCCTTTTT GATTTACACT CACAACTTGG AGTTTCACAT AGGGAGAGTA
112201 GCTGGTAAGG TATCTGTAGT TTAGATCTAC AGGAAGAGAA CCACCGACTT
112251 CAACAGCGAA GCTATGGCTG TCCCAGTCTG ATTTCCCTTG TGTGTTGTTC
112301 GCAAGCTTTG TCGTCATATT ATGGTGGTTT CTTCCATAGG AAACTTGACC
112351 ATGGAGAACA AGGGGAGTTT CTCCTGGGAG CTCTGGAAGG ACCTTAGAGA
112401 GGACGTGGCG ACGTAATGAG CTATGCAGGG GAATGACATA AGAGCTCTGA
112451 GCACAGAGAG ATCCTGCATA GACTTGAGAT TTAATATCCG AGACTACGTA
112501 ATCCTTAGAT TTGCCAAAGA GTTGGCTGAA TGCAACAGCA AAGGTATATT
112551 CTTGAGGGGT GGTCATGCTG CCACCAACAA TATAACCTCT GGAAATCAAA
112601 CGGAATCCTG CATTTTCCTT TTGCTTGTCT TGATGGAAGG CGTTGCCAAT
112651 ACCTCCAATC CAAATCCCTG GATGTGAGGG AGCGTCCGAC ATCGCAGTGG
112701 CGATCTCCTG CTGTATAGAA TGGATGTTTA CATAAGCATT CCAAAGGCTA
112751 TTAGGAACTA AAGTCGCACG AAGCTCTGGT TTAGGAGTGT ATCCTAACGC
112801 TTGCCATTCC GCGACCAAAG TCACCTTCCC TCCAGCTCCT ACTTTAGGAA
112851 CCAGAGTCCA ACTCCCTTGA TACCCATAAT CCGGAGCAGC CATGCTAGAA
112901 GGAATCGGAT TGAAGTCGTC TAAATTTACA GTTCCTGAAG TAGAAGAAAG
112951 ATCTAAGAAA GGAAGATTTA AGTTTGCTTT CAACCCAGGA TTGTCATAGA
113001 AACTTCCTTC ATTGTTATGG AATTTCAGAT CCCCTGAGAT TTTTAATCCC
113051 CCACTTGTGC TGTTTACGGC AATCGTTATC ATACGCTTGC CATCTAAAGC
113101 ATCCAGATTT ACAGAGAGAT TCTTTAGATC GATGCTGCCA TCTGTATTGT
113151 TAGTTGTCGT GGTCTCTAAG GTCGTTCCTG CATCCATGAA TACTGTAGAA
113201 TCAGGCTGCT GTGTGAAGGA ATATACTTGT AGGGTGGCTC CTTCTTTTAA
113251 AACGACATTT CCTCCTGCTA AGTTGATCTT CTGGTTCAGT ATGGTGGTAG
113301 TATTTGCAGG AATCGAGGCA TCTTGACTGG GGAGTTTTCC AGAAGAAAAT

113351 ACTATAGTTC CCGTGTTTGG GTTTGCAGGT GCTACAGGGA CTACAGGCAC

113401 TGAAGCTATA GGACCATTTT TTGGTTGGGG AGGAGGAACA ATAGCTTTGA
113451 CAACAGGATT GATGACTAAC TCCTCTATTG TTCCTCCAGA TGCAGGAGCT 
113501 TCCATCGTAA TAGGATCATA AAAATAAATC GTATGACCAG GAGCTGCTGC
113551 AAGCTTAGTG ATCTTAGCCC CTGCACCTAA ATGGATCGAG TTGGGAGTTG 
113601 AAGTTCCCTC AGTCGCTCGG TTCCCTGAGA AAGTAATATC CCCATCAATA
113651 GCCTCTAAGG AAAGTTCTCC GCTATCGGCT ATATAAATGG CGCCTCCCTT
113701 GCCTCCAGAA TTATTGGTAA AGGAGACAGG ACCGTTAGCT GTAATCGAAA 
113751 GGTTTTTCGA ATAAATCGCT CCTCCCGAAG TTTCAGCAGT ATTGCCATCA 
113801 AAGTTTATGG ATTCACTGCC TGAGATTACA CACTTAGGAG CATAAATACC

113851 ACCACCACTT CTTTTTGCCG TATTGTTAAT GAAACTTAAA CTCTCATTTT
113901 CAGTAAGAGT TAAGCTTTTT GTAGCTATGT CAGACTCTGA GATATTACAG 
113951 AGGATCGCTC CACCACAACC TTCTTGATCT GTAGTTGTTG TTGCTGTTGC 

114001 TGTTGCTGAA TTTCCAGAAA ATACAAGAGC CTTATTTTTG GTAAAGGAAG
114051 TATTTCCTTT AGTATGTAGA GCCCCTGCTG TCTTTGCTGT ATTTGTGCTG 
114101 AAGGTCACGG TTCCCGTACT TCCCGTAAGA GTAAAATCTT CGGTTTCTGT 
114151 ATAGATCGCT CCTCCTGCAG TTTCGGCAGT ATTGCCATCA AAGGTAAGAG 
114201 TCGTGTTTCC ATGCAGAGCA CACTTGGTCG CATAGATCGC ACCGCCACTT 
114251 ACTGTTGCAG TATTACCAGA GAGACTCACG TTTTCGTTAT CTTCAATCCA 
114301 GAGTCCTTTT TTAGTACTTA CAGATGCTGA CTCAAGAAAC GATAGGATTG
114351 CCCCACCGCA ACCCTCTTGA TTTGCTGAAG AATTACTCGG GCCCGTAGCT 
114401 TTGTTCCCTG AAAAGAGCAG GTTGGTATTA CCAGACAGAG AGTTGTTGCC 
114451 TTTAGAATAT AAGGCGCCGC CTGTCTTTGC TGTATTTGTG CTGAAGGTCA 
114501 CGGTTCCTGT ACTTCCTGTA AGAGTAAAAT CTTCAGTTTC TGTATAGATC 
114551 GCCCCTCCTG AAGTTCCAGC AGTATTGCCG TCAAAGGTCA GGGAGCCGTT
114601 TCCAGTTAGA GTACATTTGG TAGCATAGAT CGCACCACCA CTTACTGTTG 
114651 CAGCATTACT AGTGAGGCTG ACTTCTTGGT TGTTTGCAAT CGATAGTCCT
114701 GTTTTATCGC TTACGGATCC TGAATCAATA AAGGCTAGGA TTGCCCCACC
114751 GCAACCCTCT TGATTTGCTG AAGAATTACT CGGGCCCGTA GCTTTGTTCC
114801 CTGAAAAGAG CAGGTTGGTA TTTCCAGTCA GCGAGCTGTT TCCTTTAGAA 
114851 TATAAGGCGC CGCCTGTCTT TGCTGTATTT GTGCTGAAGG TCACGGTTCC 
114901 CGTACTTCCC TTAAGAGAAA AATCTTCAGT TTCTGTATAG ATAGCTCCGC 
114951 CACATCCTGC TGTCGCAGTA TTCTGATCGA AGGTAAGAGT TGTGTTTCCA 
115001 TCCAGAGTAC ATTTAGTAGC GTAGATCGCT CCACCATTCG CAGTTGTTGT 
115051 ATTACTAGTG AAGCTCATTT CTTGATTCTG AGAAATGGCT AATCCAGTTT 
115101 TGTCTGTTGC TGTAGCAAGA TAACAACAGA TTGCCCCACC ACAACCTTCC 
115151 GGGTTATTTG CCTGTGCTGC TGAGCCGGTT GTTTTATTTT CCTGAAAAAG 
115201 TACTTGAGTG TTGCCGGTAA GAGCAAGATT GTCATCAGAG CTCCAAGCAC 
115251 CCCCCGTCTT TGCAGTATTA GATTTGAAGG TAACGACTCC TGTATTGGCA 
115301 TCTAGCGTGC TATCCTTTTC TTTTGAGTAG ATCCCCCCAC CTTTATCTGT 
115351 AGCAGTATTT GAGGAGAAGG TCACCGTTCC TGAGTTTCCT TGGACTGTAG 
115401 TGTTTGCTGT ACTACAGAGG GCCCCGCCAT TTTTTGTGCT AGTATTTTGA 
115451 TCTAAGAGAG CTGCTGTCGT AGTCTTAGCA AGATCGATGC TGTAGGCAGA 
115501 AACTGCAGCT CCATCTTTTT CTGAAGTATT TTTTTGGAGG GTGACACTGG 
115551 CATTGTCAGT AAAAGTCGCA GTACCTCCCT CTGTATTTGT CACACAAATA 
115601 GCACCCTTGC CGCCCGAAGT TCCTGTTGCT GGAGCTGAGT CGATTAAGAG 
115651 TGACGAGAAT CCTGAGAAAG AAAGAGCTGT GTTGGTATTG TTAATTGCAG
115701 CACCATCATG CGTAAGCGCT ATGGTTTGCA GAACCAATGA GTGATCAGCT 
115751 CCAACAAAAC TCAATGCTCC TCCTGTGTTT GTAAAACAGC TTTTATCTGC 
115801 AGGAGTAATT GCAGATACAT TCGTAATAGA AACATCGCTA GTGAGAGTGT
115851 AGGTAGTTCC TGAAGCATCC GAAGTTTCCT TGGCAGTGAA TGCTGCGCTA 
115901 CCACTACTAC CATTTTCATA GTTATCGGAT GATGAGAGAT CCGTGTTAGC 
115951 AGCCATTAGT GGATGTAGGG AGAAAACTAA AGCCGAAGAG GTAAGTAGCC 
116001 AAGGTAAAGA ATATTTCATG TGTCTTTGGG GAAAAGCTTT TTATCAAAAA
116051 TACTCCCATA GCATGTGGCT TTAGGAGCAT GGTGCACCAA TAGAGAATAC
116101 AGTTAAATAA ATCAAGTAAA TGCTCTGGAG AAGACTCTCA GTTATAGAAG
116151 TTTCAATCTT GGGAGAGAAG CATTTAAGGT ATTTTTCTAT ATTTAAGAGT
116201 CCCTTAAAAC ATAAGGGAAA TGCTTAAGGG TAGGGGAGAA GGTGTACAAG
116251 CGGTTTTGCT TTTAGACCTT CTTGAATTTT AGAAGGAGAG AGTAAGGAAG
116301 ATGGTTATCT AAACCACGGA TCCTTATTTG TTCTTTCTGT GTTTCCGTTT
116351 TTCTCTTTCG TCATAACCGT CAGAGAATGG ATTGGAGGGG CTGAGGTTAG
116401 CCGTGCTTTT TCCGTCTGGA GGCGGAGTTT TAAGAGATTG ATTCGATTTC
116451 CCGCTTTTCT TCTTAGATTG CTGTTTCTGT TCTTCATCTT GATTTTGCTG
116501 TTGCTGCTGT TTATCTTGGC GATCACGAGG GGAACGGCGG GAAGATGAAG
116551 AAACTCGAGG AATTGAGGGT TCTTTTCCTC CCTTAGGGTA GACCGGCTOA
116601 GGGTGTAGAA CCGTCCCCGT GGAGAAATTA GGAGAGCGGG AACTTGCAGQ
116651 CATTATGGGT GTAAACGCAC TGCTCGCTCC ACTACCGAAT GAACTGOTTC
116701 TTAAATCTTT GAAGTGATAA GGCTGAAAAT CGTCCTTAAA TGGAGGACTT
116751 ATAGATGCGG GTCGTGTAGA AGCCGCATCT GAGGGTTTTG TTTTAAAACG
116801 TGAGAGGCTG CCGAAGGAGC GTCTATGCTC AGGATTTCGT GACGAATAGA
116851 AGAAAGATCC TTGAGCCTCC CGACCGATCC TACGATGACG GGCATCTTGG
116901 CGTTGTGAAG CTTCGTGTTC CTCCATGCGG GCAGATGCAG AAAGATCTTT
116951 TTTCTTTTCC ACTGCGATTT TTTTTGCATC GTCGGCGGAT GGAGTCACTA
117001 GAATAGGCTC CGCGGTCTCT TGGGTTTTCT TTTCCATCCT CAAGAGGTGT
117051 TCCCTACGGA AATAATCTAA GGTGAGCTTA TTCAAGGACA TGAAGTATAG
117101 CCCCACCGTA ATCAATCCCA TGACAAACAT AGGGTTGGCA AAGACTAACA
117151 TCGTCCCACT AGAGGCAACG AAAGCCCCTG CAATAAGACC CGCAGCAATC
117201 GCTAATACAA TGATAGGAAC GACGATAGCA GTGATTGCCT CACCGATTTT
117251 CTTGGCCTTC GGACTGTCCA GAATATCAGA AATAAGTAGA GTAACTCCCA
117301 AAGCTCCCAG GGCAAGTGCC GGAGCGAGAG CATAAAGCAT AAGGCTGTTT
117351 CCTGAAGCAG TCAGGAGAAT CGAAAGGATC GCAATTGCCG CAAGGGCAAT

117401 AATGCCTGTG TCATAGACAT AGCGCATTTT AGGATATTTA TCAGGAATAG 

117451 TAATAAATCT TTTGAATAAC CCTGTGGATG ACGTTTTTAA ACGTTTTACC

117501 ACGCTTGGCT GCCCTGATGG GGATGCTGCT GCAGGAACTT GAGGTTGACC 

117551 TAAAGGGTTT ATAGGGGGTT GGCTCATACT ATCAACTTAC TGTAATTATC

117601 ATTAGGCCCA TGAATTTTCA TTCATAGGAT ATATTTCATA CTATTATAAG 

117651 ATTTAATAGG ATTTAGTTAG TTCTCTTTTC TTCTGAGTCT TAACTTTTTT 

117701 ATTAAATAAA GTTTATTTGT TAAAATCTTA ACAGATTTTT AACTAAAACT 

117751 TTAAGTTATT TTTATTTGGA ACTTTTAGTC GAAATAAGAC TCGCTTATGA 

117801 GAGGGACATA CTCATCAGCA AATGGAGGGG GCGTGTGAGG TCGTGAGGGT

117851 GGAGGCTCAT ACGGGGGGAA AAGATTTTTA TGCACTTGGA AGAGGGTAAC

117901 GCTAGTAGTT ACAGACATGA GAAGCCCCAA GGAGATAAAG ACAACTGGAG 

117951 GGGGGATAAA GACCAGACTA AAGACCAGAC CGATAATCAC AGCAATCGTA 

118001 AGAATGTGGA GGAGCCAAGC CATAGCATAC GAAGCAAGAT GTTGGCAGCT 

118051 TTTTGTTTTT ATAGCGTTAA AGAGAGCCCG TAGGGGTAGG GTCAAGGCAG 

118101 CATATGTAGA AGCGCAAGGG ATCAGGATAA CCTTAAGAGC TCCTAAGACC 

118151 GAGGATACCA GTACTTCTAT GGTAAGAGCA GTTCTGGGAT AGCTCTTCGC 

118201 ACAGTTTTTT AATTTTCGAG CTAGTATTCG TTCCGGAAAA ATGCCATTCA 

118251 GGTATAGCTG AGAGCCTTGT TTGCAGATAT TTTTGAATCC CATATCCGTT 

118301 TGAAAGAGAA TATTTTATGA AAAATTATGT AAAAATTCTA AGAGGATAGT 

118351 GGTTTTTAGA CAATCGAAAT TCCTGAAAAG GCAGGAAAAT GAGAGACACA 

118401 AGTAGACAAA ATCTCCTGAA GTTTTTGTAT GGGCCTGTAA AAAAATCTTT 

118451 CTGGAAACTG GAAATTAGAA GTTCATTACA GCGGAACCAC CGAAAAATGC 

118501 GGAGGTTTTA AAAGTATGGG ACGTTTTATT ATTGTTGTGA TCATCAGCTA 

118551 TCGAGATATC ATTCCCAATA GACCAACCGT AAAATCCCTT AATAAAACTT

118601 CCCGGCCAAG GGGTGAGCTT CACGTTAGCT TCGATTTCAC GTCCTTGATA 

118651 TTCAAAGATG CCGCGAGAAG AAATTAGGTG ATTTTTGTGA AGTTTTTTTC 

118701 GGAATCTTGT AAGGCGGTAA GCAAGCCCTA AGTTACAGAC AGATGTCGAG
118751 CGGTAATCAA TAGAAAAATT CACAGGATAG ATGCAATTGA GAGTTAGTTG

118801 GTCGGTAGCC TTGTAACTAA CACCTACTAA AGGCCAAGCC TTCTCTTGAT 

118851 GGAGGCCTGT TTCATTAATG ACGCCAAAAA TAGCAGAAAG CTTCTCAGTG 

118901 GCCTGGTATT TTCCAGAAAG AACTCCTTGA TAGAGTCCAT AACCCATCTC

118951 AATATTTTTA GGATCCACAA GCCCAGAAAG AATGATAGAC CACTGCCAAT 

119001 TTTTTAAGGA GAGTGTATAA GCTCCTAAAG AGAGGAGAAC ATAGTTATAA 

119051 AAAGAAGTAT CTTGGAAAGT CGCCCACCCA AGTCCATTAG GATCTGTCTC 

119101 CGAAATAGGA AGTGAGCTTT TCCATTGAAT ATCCGCACCT ATATAGCCAG 

119151 TAGAAAACAG TAGCCCAGAA TGCTCTGTAA TCGGAAGTGT GCAGAGAAAC 

119201 GTTCCATCGT ATTGACGATA GCCTATAGTT TGATGAGGCA GCTTTTTAAA 

119251 TTTAGCATCG TTCACCTTTA GGTATTGTAC CTGAGCAGAG AAAGGACGTG 

119301 GAGGAGGATT TTTACATGCT TCTTCATCAA TTCCACAAGC ATCTTGAACA 

119351 ATAAAAATAG GAGTCGAGAG TACGTGTCCC GCAAATGCAG CGATGTGGAA 

119401 GAGCAGTTTG AACATGTTCT GTAAGATTCT CCAACGTTAC TAGAGATTGA 

119451 AATGGAATAT ACGTAATTTT TAAATTACTA TCTAATAATT TTCCTACTCA 

119501 GGAGGGACTT TCAAGAAGAT TTCCCTAGAT TTAGGGGGAT GAAAGGACTA 

119551 GAATTTTTCT AATCAGCAAT CGAACAAAAT TACAGGGTTC TTGGAGAGGT 

119601 GCTTAATACT ACGCATAAAT GATTTCTGAG ATGATTTCCG TTTCGGCTGT 

119651 TTCTGTGCTA ACTTGTTTTG GCACTCAGTA ATTTTAAGCT CAAGATTTTG

119701 GATCGATTCC TTTTTCGCTT CAATTTCATT TAAAATTCTT GTTTTTTCTT
119751 GATTTTGTGT TGTCAGCTCT GGAAGATGCA CGCTTTTAAT TGTAGAGAAC
119801 ATCAACCCGT ATCTCTGAAC AAATCCTGAT CCTATAGAGG AAATGTCTTG 
119851 AGATATCTCA TCAATCAGTT TTGTTCCTGT CTGCTTATTT TTAAGAGGGA 
119901 TTGCTAGAAG AAGAGCAAGA ATCAGTGTGA GTACGATAAT TCCCAAGCCG
119951 ATGCCACAGA TGATCCAGTT TCCAGTATAT CCTGCGTAAC CTGCTGAGAT 
120001 CGTTCCCCCT AAGGTTAACA GTAGTGTTAA GGCAAAACAG AGCGTATGCT 
120051 TTGTGGAAAG TTTGGAGCGA AAGGTCTCCC AAGGAGCCGC TGGAATCGGA
120101 TGAGGCTGCG TAATATAAGC ATTAGGAAGA TTGAGAACTT CTGTAGCATG

120151 AGAGAGAGGG CTGCTCTCGG GGACTGGTGA TGGGGCTACG GAAGTTGCCA 

120201 TGTTGTTTCC TCGAATCGTT GCTAACTAGT TTTGATTTGT CTTTTCATTC

120251 TTGTGAGAAA GCTCAAAACG TTTTTGATAA AGAGTTGATT GCAGTTTGAA 

120301 CTCTTTTTGT GAGAGGTCGC AGAGTTCTTC GTATAACTTT AGCGTATCTT

120351 GAGTGGTTTT TCTCTGTGCA TTGGTTACAA GCTGTAGAGA GGATTCAGGT

120401 CGAATCTTCT CCCTGATAAA TTGTACTGTC GTATGGAGCT GCTGAGGGAG

120451 CTGCCGGATG AATTTAATAA ATCCTACCAA GGCTTGTAGG CAGAGAAGAG 

120501 TCAAAATAGT AAGAACAATG CCAACGGCAA TCAACAGAAT GCTTTGGCTA

120551 TAGCAACCCA AACAGATAAT AGCGATTCCA GCAAGAGCTA CAATCACTAA

120601 GATCGTAATC GCAGCAATAT GCATAGGAAT GGAGTGGGTG AGAAAAACAT 

120651 TGCCCTTCTC CCCCAAAGCT ACGATCTCCT TATTCGTAAT GAATAAATCA 

120701 GCAGACTCTT CCGGAAGGGA TGAGGGAAAT ACCCCGTTTA AAGTACTAGA 

120751 CACAAAGAGA ACTCTATTAT TTGAGGAAAT AATTTAAGAA AAATGGTATT

120801 TTTAGTCAAT TAGTAAGCGA GTCATGCCTC TTAGTTATTC AAATTTTTAA

120851 AACCTTACCC TTCCTATGAG GAGACAAGTA AGAGAAATTA TGCAACAAAC

120901 TGTAATTGTA GCAATGTCAG GAGGCGTGGA TTCTTCTGTC GTTGCCTATT

120951 TATTCAAAAA ATTTACCAAT TATAAGGTTA TTGGCATCTT CATGAAGAAT

121001 TGGGAAGAGG ATCGCGACGG CGGTCTCAGC TCGACTACTA AAGATTATGA 

121051 TGATGTCGAG AGGGTCTGTC TTCAGCTCGA TATACAGTAT TACACCGTAT 

121101 CTTTTGCTAA AGAATATAGA GAAAGAGTGT TCGCTCGTTT CCTCAAGGAA 

121151 TACTCTTTAG GCTACACTCC TAACCCCGAC ATTCTTTGTA ACCGAGAAAT 

121201 CAAATTTGAC CTTCTACAAA AGAAAGTCCA GGAACTTGGC GGAGATTACC
121251 TCGCTACAGG GCACTACTGC CGATTAAATA CCGAGCTCCA AGAAACCCAA 

121301 CTCCTTAGAG GTTGCGATCC TCAAAAAGAT CAGAGCTATT TTTTATCAGG

121351 AACTCCTAAA AGTGCTCTTC ACAATGTGCT CTTTCCTCTT GGGGAAATGA

121401 ATAAGACTGA AGTTCGTGCG ATTGCAGCTC AAGCAGCTCT TCCCACAGCA
121451 GAAAAAAAAG ATAGTACAGG CATTTGCTTT ATAGGGAAGC GCCCTTTTAA

121501 AGAGTTCCTA GAGAAGTTTC TTCCCAATAA AACAGGCAAC GTTATCGATT 

121551 GGGATACCAA GGAAATTGTA GGGCAACATC AGGGAGCTCA CTATTATACT

121601 ATAGGGCAGC GGCGAGGACT TGATCTTGGA GGATCCGAGA AACCCTGTTA 

121651 TGTTGTGGGA AAAAATATAG AGGAAAATAG CATTTATATT GTGAGGGGGG 

121701 AAGACCATCC CCAGCTCTAC CTACGGGAAT TAACAGCTAG AGAGCTCAAT

121751 TGGTTTACCC CTCCTAAATC CGGATGTCAC TGTAGCGCTA AAGTCCGCTA 

121801 CCGTTCTCCT GATGAAGCTT GCACGATAGA TTATAGCTCA GGTGACGAGG 

121851 TCAAGGTGCG ATTTTCACAA CCCGTCAAGG CGGTAACTCC AGGACAAACA 

121901 ATAGCGTTTT ATCAAGGAGA TACCTGCCTT GGTAGTGGAG TTATCGACGT 

121951 TCCTATGATT CCAAGTGAGG GCTAGGGAGA GCAGCTTCCT GCTCCTCTTC 

122001 TTCCCTTTCA AAGGCAACGC GATTTTCAAC CAAGGTTGCT CGTAGCTTGC

122051 GAGCTTCTTG ACGGCAGGAC TCTTTAAGCA AGAGCTCCGC TAGAGGATCT

122101 TCAAGGTACT GCTCAATGAC ACGGCGTAGA GGACGTGCTC CCATTTCTGG 

122151 AGAATGCCCC TTCGTTACTA GGAAGGAAAT CACAGAGTCT GGGATGTTCA 

122201 AAGCCATTTG GTAGTTTTTC AGTCTCGAGT CCAGTTTGTT GATCTCTAAA 

122251 TGGATGATCT CCGATAGAGA TTCTTTCTCG AGGGGACGGA AAATCACACT

122301 TTCATCCAAA CGGTTAATGA ACTCAGGCTT TAAGTGTTTC TTCATAGCAT

122351 GTTCGATTTT CTCTTGGATG ACCTTATAGT CCATATGGGA CTTCAAGCCA

122401 AAACCAATTT CTCCGCTTTT ACGAATGAGA TCAGCTCCCA AATTGGAGGT
122451 CATGATAATA ATGGCATGAC GGAAATCCAC TTTGCGACCA AAAGAATCAG
122501 TAAGACGTCC TTGCTCTAAA ATTTGCAACA TCAGGTCCAT AATGTCTGGG
122551 TGTGCCTTTT CTATCTCATC AAAGAGAACA ACGCAGTAAG GACGGCGACG
122601 TACCTGTTCC GTAAGGTGGC CCCCTTCTTC ATGACCTACA TATCCTGGAG
122651 GTGATCCCAT CATCTTGGTA GCAGCAAATT TCTCCATGTA CTCTGACATG
122701 TCTACCTGAA TCAGAGCGTC TTCACCACCG AACATCTCTA TAGCAATTTG
122751 TTGGGCGAGC AGGCTTTTCC CTACACCGGT AGGCCCAAGG AATAGGAAGG
122801 AGCCCGTAGG TCGGTTAGGA TCTTTGATCC CTGTTCGAGA ACGTCGGATG
122851 GCACGGCAAA TGCTGGTAAC GGCATCATTT TGACCAATGA CTTTTCTTCT
122901 TAACGTGTCT TCTAACTTCA GAAGCTTCTC ACTTTCAGCT TCTGTGAGCC
122951 TTGCTGAGGG AATTCCTGTT TGTAGAGAAA CTACCTGAGC GACTGCTTCT
123001 TCATCTACAG GAACTTGGTG CTCTTCTTTA TGATTTTCCC ATTCCTGTTT
123051 CATACTTTGC AGACGTTCGC GAAGTTTTTT CTCTTCATCA CGTAAACCTG
123101 CAGCTTTTTC GTATTCTTGA GTTCCAATGG CCTGCTCTTT GGCCAATTTT
123151 GTATTTTCGA TTTCAGCCTC TAGCTTCATT AAATCTGTAG GCTGACCCAT
123201 TGTATTCACA CGGACACGAG CCCCAGCTTC ATCTAAAAGA TCTATTGCTT
123251 TATCAGGGAG GAAACGTCCA TGAACATATT GATCAGAAAG AGTCGCAGCT
123301 GCTTTTAAAG CTTCTTCAGT AATGAAGACA TTGTGATGTT CTTCATACTT
123351 TTTCTTGAGG CCACGTAAAA TCTCAATAGT CTCATCTACA CTAGGAGGGT
123401 GAACCACGAT TTTTTGGAAA CGACGTTCTA AAGCTGCGTC TTTTTCTATG
123451 TGCTTGCGAT ACTCATCTAT CGTAGTTGCT CCAATACACT GAATTTCACC
123501 TCGCGCTAAC GCAGGTTTTA AAATGTTTGA AGCATCGATA GCACCTTCAG
123551 CTGCTCCTGC TCCTACAATC GTGTGGAGCT CGTCAATGAA GAGCAAGATG
123601 TTTCCATGCT TGCGAACTTC ATCCATGACA GCTTTGATCC GTTCCTCAAA
123651 TTGCCCTCGA TATTTTGTTC CAGCAATCAT TAATGCTAGA TCTAGAGTAA
123701 TCAGTCGCTT TTTCCGTAAG GCATCAGGAA CCTCATTCAG AATGATTTTT
123751 TGAGCCAGAC CCTCAACAAT TGCAGTCTTA CCAACTCCAG CTTCTCCAAT
123801 AAGTACAGGA TTGTTTTTTC TTCTTCGGCA AAGAATCAAA ATCAACCGTT
123851 CGACTTCTGA AGAACGACCA ATGACAGGAT CGAGCTTAGA CTCTCGGACC
123901 ATCTCCGTTA AATCATAACC ATATGCTTTC AGAGCAGAAA GCTTTTCGTT
123951 TTTGTCAGAA CCTAAGCTAT GACCTAAAGG AGATTTTGAA GATGAAGGGT
124001 TGCTTCGAGA GGATGAGGAA GAAGACGACG ACGAAGGAGG AAGTTGTAGA
124051 TTGAAGGTCT CTAATTCTCT AAGAATfTCC TTACGAACCT CTCTTGGATC
124101 GATATGTAAG TTTTCTAATA CCTGAAGAGC GACACTATCT GATTGATGTA

124151 GGATCCCTAA GAGTAAATGC TCCGTCCCGA CATAATTGTG CTCTAAAAGG
124201 CTGGCCTCTT CATTTGCTGA TTCAAAAGAT TTTTTTACTC TTCCTGTAAG
124251 GGCAGGGTCT CCGTAGACTT GAATTTCTGG ACCATAACCA ATCAGGCGTT
124301 CCACCTCTTG CCGTGCCGTA TCAAAATCTA TACCGAGGTT GCGTAATACA
124351 TTAACAGCTA CCCCTTGACC AAGTTTGAGA AGACCAAGCA GGATGTGCTC
124401 AGTACCCAGG TAGTTATGAT TTAAACGCTG AGCCTCCTTT TTCGCCAGTT
124451 TAATGACTTG TTTTGCTCTA TTAGTGAACT TCTCAAACAT AAAAACCTAA
124501 AAGACAGGGG TAGAACTTTC CTTAAGCATA TACGAAATTT AAAATAATGA
124551 TGCAACTCTT CGCTCTAAAC CAGCAAATTT GGTAAAATTC CTCTGAGTTT
124601 AAGGGAAAGT TATGCACAAA CCTTTTGTAT ATGATACAAT AGTTCAGCTT
124651 CTTTTGAAAC AGTCTTAATT AGTTTTATGT TTGTTATATG AAAGTTCGTA
124701 TCGTAGATTC AGGAAAATCT TCAGCGGCCT CCCACATGGC TAAGGACAGA
124751 GATTTATTAG AATCTCTGCA AGATGGGGAG CTCATTTTAC ACCTTTATGA
124801 GTGGGAGAAT CCTTGTTCTC TGACGTACGG TCACTTTATG CGTCCAGAAA
124851 AATTTTTACT TTCCAACTAT GCGGATCTAG GATTGGACGC CGCAGTGCGG
124901 CCTACGGGAG GGGGATTTGT CTTCCATAAG GGAGATTATG CTTTTTCTGT
124951 TCTTATGTCT GCGACACATC CTTCCTATTC TTCTTCGGTA CTTGAGAACT
125001 ACCATACTGT AAACTCTTTT GTAGCGAAGG TTCTAGAGAA AGTATTTCGG
125051 ATCCAGGGAA TGTTAGCTCC AGAAGACGAA AACTCTTCTT CCAGAGATTC
125101 AGGAAATTTT TGTATGGCAA AAACTTCGAA GTATGACGTT CTTTTTGGGG
125151 ACAAGAAGAT AGGGGGCGCT GCCCAACGCA AGGTGCAACA GGGATTTTTA
125201 CATCAAGGAT CCTTATTCTT ATCGGGAAGT TCTTCTGAGT TTTACCAGAG
125251 ATTTTTAAAA CCCGAGGTTC TTGAAGAAAT TATTGAACAA ATCCAGATTC
125301 ACGCGTTTTT CCCTTTAGGT TTGGAAGCTG CTGATGAAGT GCTGCAGGAG
125351 GCGCGTCAGC AAGTCAAAGA GGCGTTTATT AAATTGTTTT GTGGTGAGGG
125401 GTTATGATGA GTCGGTTGCG TTTTCGCTTG GCAGCTCTTG GAATATTTTT
125451 TATTTTGCTG GTTCCTAATT CTGTTTCAGC AAAGACAATC GTAGCTTCAG
125501 ACAAGGAGAA GGTTGGAGTT CTTGTTTATG ACAATAGTGT AGAGGCCTTT

125551 CAACAGATAT TGGATTGCAT AGATCATGCA AATTTTTATG TAGAACTGTG 

125601 TCCCTGCATG ACAGGAGGCC GAACGCTTAA AGAGATGGTA GATCACCTCG

125651 AGGCTCGTAT GGATCTGGTT CCAGAGCTCT GTAGCTATAT CATTATCCAA 

125701 CCCACGTTTA CCGATGCTGA AGACCAAAAA TTACTCAAAG CTCTCAAAGA

125751 ACGTCATCCC AACCGGTTTT TCTACGTTTT TACAGGGTGC CCACCCTCAA

125801 CAAGCATCCT CGCTCCTAAT GTCATTGAAA TGCATATCAA ACTTTCTATC 

125851 ATCGATGGGA AATATTGTAT TTTAGGTGGT ACCAATTTTG AAGAGTTTAT 

125901 GTGCACTCCA GGGGATGAGG TTCCTGAGAA AGTGGATAAC CCACGTTTAT 

125951 TTGTCAGTGG AGTGCGTCGG CCCCTAGCAT TTCGTGATCA GGATATCATG 

126001 TTGCGTTCTA CAGCATTCGG TTTGCAGCTC AGAGAAGAAT ATCATAAGCA

126051 ATTTGCTATG TGGGACTACT ATGCACATCA TATGTGGTTC ATTGATAATC

126101 CTGAACAGTT TGCAGGCGCC TGTCCTCCAC TGACTTTAGA ACAAGCCGAG

126151 GAGACAGTAT TTCCTGGATT TGACAAACAT GAAGATCTTG TTCTTGTCGA

126201 CTCTTCCAAG ATCAGGATAG TTTTAGGTGG TCCCCACGAT AAGCAACCCA

126251 ATCCTGTGAC TCAAGAATAT TTGAAACTTA TCCAGGGAGC TAGATCTTCT 

126301 GTGAAGCTTG CTCACATGTA TTTCATCCCT AAGGACGAGC TTTTAAATGC 

126351 TCTTGTCGAC GTTTCTCATA ATCACGGTGT TCATCTGAGT TTAATTACGA

126401 ACGGCTGTCA TGAATTAAGT CCTGCAATTA CAGGACCCTA TGCTTGGGGA

126451 AACCGTATTA ACTATTTCGC CTTGCTCTAT GGGAAACGGT ATCCTCTTTG 

126501 GAAAAAATGG TTTTGCGAAA AGCTAAAACC TTATGAGCGG GTTTCTATTT

126551 ATGAGTTTGC TATTTGGGAA ACGCAGTTGC ACAAGAAGTG TATGATTATC 

126601 GATGATGAAA TTTTTGTGAT CGGAAGTTAT AATTTTGGAA AGAAAAGTGA

126651 TGCCTTTGAT TACGAAAGTA TTGTAGTTAT CGAATCTCCA GAAGTCGCTG 

126701 CAAAAGCTAA CAAAGTCTTC AATAAAGATA TCQGATTGTC GATTCCTGTA

126751 AGTCATGGCG ACATTTTCTC TTGGTATTTC CATTCCGTAC ACCACACTTT

126801 GGGACATTTG CAGCTGACCT ATATGCCAGC CTAGCGTCCC TGGGTGCGAA
126851 TCTACCAACA GGATCTCTTC TGCAGGCTCT GCAGGGATCC TGCCTGGTTT

126901 TTTTCTCTGC TATCGTTTAC ACTACGCTTT TATTGTTTGG GTAGAGGGTG 

126951 GACCTTGTTA TCGTTCTTCT ATAAGCATCA AAAAAAATTT ATCGGCATTG

127001 TCATTGCTGT AGTTTGTGTT TCTTGGTATT GGAGTGGGTT GGGGACGATT

127051 CTCTAGAAAA GGTTCTGCAG AGTCCACCTC ACGTCGGACT GTTTTTACTA

127101 CCGCTTCAGG GAAGCGGTAT GTAGAGAAAG ATTTCATGGC TATGAAGAAG 

127151 TTCTTTGCTC ACGAAGCGTA TCCATTTACA GGGAACCCTA GAGCTTGGAA 

127201 TTTTATCAAT GAGGGGCTAC TTACTGATTA TTTTCTAACG ACAAGGGTGG 

127251 GAGAAAAACT CTTTTTAAAA GTGTACCATC CGGGAGAGAA AATTTTTAGT 

127301 AAGGAGAAAG CTTACCAGCC GTATCGTCGT TTTGACGCTC CTTTTATTTC

127351 CTCTGAAGAA GTTTGGAAAT CTTCAGCTCC CCAGCTTTTA GAGATCCTGA

127401 AGGTCTTTCA ACAAATCGAG AACCCCATAT CAAAAGAAGG ATTTCTTGCT 

127451 AGAGCCAAGC TCTTTTTAGA AGAGAGAAGG TTCCCTCATT ATGTGCTTCG

127501 ACAAATGTTG GAGTACCGCA GGCAAATGTT TGCTCTTCCC CCAGATGAAG

127551 CCTTATCTCG CGGGAAAGAC TTGCGGTTAT TTGGCTACCA GACGATTCAA

127601 GACTGGTTTG GGGATGCCTA CCTTTCTGCT GCTGTTGAGC TCTTGATCCG

127651 CTTTATTGAC GAGCAGAAAA AAGTACTTCC CAGGCCCTCA AAACAAGAAG

127701 CTCGTGACGA CTTTTATGAT AAGGCGAAGC ATGCCTATAC TAAGATCAGT

127751 AAGAATAAGG AATTTTCCTT AGGATTTGAA GAATTTGTAA ACTCGTATTT

127801 TCAGTTTTTA GAGATCTCTG AGTCCGAATT TTTCAATATG TATCGAGACA

127851 TATTGTTGTG CAAAAGAGCT CTTCTCCTAT TGCAGGGAGG CGTTTCTTTT

127901 GACTTCCAAC CTCTAACTAC ATTTTTCGTT CAAGGAAAAG ATTCCATACA 

127951 AGTAGAGTTC TTTAGACTCC CTAAGGAGTA TAGCTTTAAA ACAAAACAAG

128001 AGTTAAAAGC TTTCGAAGTC TATTTAAAGT TAGTGAGTTT ACCTAAATCG 

128051 GATAGTTTGG ATGTTCCTAA TGAGATCCTT CCTATAGCGA CCATAAAAGC 

128101 TAAAGAGCCT CGGTTAGTAG GCAGACGGTT TTCTATAGAC TATAAGAGAG 

128151 TCGCTTTGCA AGACTTAGCA GCTACTGTAC CTATGGTTGA AGTGCTGCAC
128201 TGGCAACAAA ATTCTGAGCA CTTCCAGGAG ATTCTCCAGC AGTTTCCTGA
128251 CGTTGAGACG TGTCAGTCGT ATAAAGACTT CCAACATCTT AAGCCTGCGC 
128301 TGCGAGATAA AATTTCTCTT TTCACACGCA AGGAAATCTT AAGGGCCCGC

128351 CCTGAGAGAA TTCTGCAATC GCTACAGCAA GTTCCTAAGC AGAGCCAAGA
128401 AGTTCTCTTA TCTGCAGGGA AGAATAGTGC TCTACCAGGA ATATCCGACG 

128451 GTCAGCAATT AGCCAAAGTG TTGCTTGAAA ACGAGGTTTT AGATTTATAT
128501 AGCCAGGATG CAGAGACCTA TTATACTATT ATTGTTAATA GTTCTTTTGA
128551 AAAAGAAGAA GTGCTTCCTT ATCGTGAGGT TTTAAAGAGA GATTTGGCCT
128601 CACAGTTACT TACTTCTCAT GGTCATCTTG TTGACATGGA GCGTCTAGAA
128651 TCTGCGTTGC GTACACGGTA TCCAGGAGAA GAAGGCGCTA GCCTATGGCA
128701 ACGACGTCTT TGGAAGGTAG TGGAAAACCA CAGATTGGGA AGGCATCTCG
128751 AGGGGTCTTT CTCTTGGAGC TTAGATCGCT CATTGAAGAC TTTTTCCCGA
128801 GGAGACAAGG AGCTGCCCCA AGAGTTTGAT AGGATTTTCT CTATGAAGGT
128851 AGGAGACTAT TCTTCTGTAT TCATGAGTCC TAACGAAGGG CCCTGTTATT
128901 ATCAATGCCT CTCTCATTTA CTGTATGATC GTCCTGCTAG CGTGGATAAA 
128951 CTATTTTTAG CTAAAAGTCA GCTAGATGAA GAACTTTTAG GATCCTATAT 
129001 GGAACGCTTT ATAGAACAGG GAGTCGTAAG GTGATGTGGT ATTCTGATTA
129051 TCATGTTTGG ATTTTGCCCG TCCATGAGAG GGTGGTGCGC CTCGGGTTAA 
129101 CAGAAAAAAT GCAGAAAAAT TTAGGAGCCA TTCTCCATGT GGATTTACCT 
129151 TCAGTAGGGA GTCTATGTAA AGAAGGTGAG GTTTTAGTCA TTCTGGAATC 
129201 TTCTAAATCT GCTATAGAGG TGTTAAGTCC TGTATCAGGA GAGGTTATCG 
129251 ATATCAACCT TGATTTAGTG GATAATCCTC AGAAGATTAA CGAAGCTCCA 
129301 GAAGGTGAGG GATGGTTGGC TGTAGTCCGA CTAGACCAGG ACTGGGATCC

129351 TTCTAATCTT TCTTTGATGG ATGAAGAGTA AATTTTTTAT TAGATATACT

129401 CATTTTTTTC AGAAGATAAG AGGTATTTTT TTAAGGCTAA AACATTTAAA
129451 ATTTATGTCT AAGGTTTAAA AAATACATCA GAATTATTCT ATGGATCCAG 
129501 CTAGTCCGGT AGCCCCTCAT GTCCTACAAG ATCATGTGCA ACTATCTTCT
129551 GAAGAATTGT CCGCATTATC TTCCGGGGTA TCTCGTGTGA AGAAGPTTAP
129601 TATAGCCATC ATGGTCCTTT CATTGATAGC GATTTCTTTG GTAGCCTGTG
129651 GCCTATTTTT AACGGGATCG GCACCTCTAC AGCTCTCGAT CTGGATTGCT
129701 GCGAGTTGCA TTACCTTATC TATGTTAGTT TGTGCGTGTT GGCGTTATAA
129751 GATTTCCAAT GCCTTAGAAA AAACTAAGGT AGCGCATGAA AGCTGAGTTG
129801 GACATTTTAT TGATTGAAAA ATCATGACTA CATTACCTAA GTACGTTCCC
129851 CGTTCTCGAC AAAATCCCGA TACTCTGACC TTCCTAAAAC GGTATTCTAG
129901 TGTCCTTCTC CATTCGGAGA ATTCTTTATC TTATCGGATT TTTGPGAAAP
129951 TGCTTGCTAT TCTCCTCACT TCGTTAGCTG TAGCTTTCGC CGTGAPTTTP
130001 TTTTCTTGTG AAGGTTCTCA ACTGAGACTC TGCGPTCTPT ATATAPPTAT
130051 AGCTCTTGCT ATTTGTGTTT TACTGACGAT PGTTGTTTAT TPTATP^rAa
130101 GTAAAATCGC CACAGCTTGC AAAAAGCCGC CTTPPATATP TPPAATTPAA
130151 ATTGTTTAGA AGCATCTCTG TGTACAAAAG TTCACTAGAA ACTPGAPTPT
130201 AGGAAAGTTC CTAGAGAGAG CGACGCTGCG TTCGTGCTTT AGAAATAPTT
130251 GGCTGGTGTT TGGACGAAGA TTCTTTTAGT GGATTGGTCG TGTTTTPAAP
130301 AACTTCAGTC TCTAGAGGAG CTCTTTGAAT CTTTGCTGAA GGTTTAPAAA
130351 TATCAATACC TGTTAAGCTC ATAAAAGCCA TAGCAATGAG GPPTPTTPTA
130401 ATGAAGGAGA TCCCCATTCC CTGGAGGTTT TTGGGAATAT PAPAPTAPPP
130451 GAGTTTTTCT TTGATAGTGG CTAAAATAAC AATAGCGAGC PAPPAPPPAP
130501 ATCCCGCTCC TAAAGAGAAG ATCATCATAG GAATAAAAGG ATAAPTAPPT
130551 GTGATTCCGA AGAGCACACC CCCTAGGATC GCGCAGTTCA PAPPAATPAA
130601 GGGAAGGAAG ATCCCTAAGG AGAGATATAG            APPTTTTPTA
130651 AAAGAAGCTC TAAGATTTGC GTGAATGCCG CAATCACCAP GATQAAAATA
130701 ATCAGCTCCA GAAAACCTAG GTTTACAGAA GCTAAAGATG GAGAGATCCA
130751 AGTTAGAGCT TTAGGGCCCG TGATGAAAGC ATGGACAAAC CAGTTGATGC
130801 TCCCTGTTAC AGTGAGAACA AGGGCTACGG ACATCCCCAA GCCATTGGCT
130851 GTAGAAACCC TAGTAGAGCA AGCAAGGTAA CTACACATCC CCAAGAAATT
130901 CGCAAGAAGG ATATTCTGAA TAAAGGCTGC TTGTAGAAGA ATACCAAAGA
130951 CATTAAGCCA AGTATACGCA CCTAACCACA TAAACTACCT TTTTCTCTTT
131001 TTAGAGTCTC GAATGTTAAC AAGCCAAATC ATAATACCAA GTAGGAAAAA
131051 AGCCGACGGT GCTAGCACCA TAAGACTTAA ATTTTGGTAT CCATCGGGGT
131101 GGGTTTCGGA AGCATAAACA AATTGAGGGA TGATGCGAAA CCCCATAAGA
131151 GTTCCAAAAC CAAAGAGTTC TCTGATGACT CCAATGACAA GTAAGACCCA
131201 GCCGTATCCT AAGCCAGAGG CAAACCCATC TAAGAACGCT GGAATAGGAG
131251 TCACATGCCT AGCTAGACTT TCAGACCTTC CCATCACGAT GCAATTGGTG
131301 ATGATAAGAC CCACAAAAAC AGAAAGTGTT TTGGAAATAT CAAAGAAAAA
131351 AGCTTTTAAA AACTGGTCGA TAACAATCAC AAACAAGCTA ATGATAATTA
131401 GCTGAGTAAT CATTCTCACA CTGTCAGGAG TGAACTTACG TAATAAGGAA
131451 ACAAAGAAAG ACGAGCATCC TGTAACAATG CTGACAGCAA TTCCCATAGT
131501 AATTGCCGTT TGTACTGTTG TTGTCACTGC CAGAGCCGAG CAAATCCCCA
131551 AAATCGCAAT GAGAATTTGG TTGTTGCTCC ATAGAGGATC AAAGAAATAG
131601 CTTTTATAGG ACTTTTTACT TGTCATTCGC CTGTTTTCTT TTCATGGGTT
131651 AAATTAGAAA AATTTATAAG GAGCTGACGA TAGCAAGCCA GAGATTGTAC
131701 ATAAGCTTCA GTGACACCGT TGCATGTTAA GGTGGCTCCA GAAATCCCAT
131751 CAATAGCAGA AAGAGCTTTT GGAGAATCTC CCAAAGTAGT ACGCACGGAA
131801 CCTTTAACTA CCTCAAGCCC TAGGTCTGTT GTTGCAAAAT TTGTAGTTCC
131851 AGAAGAATCT TGTAGGAAGA TTTTCTTCCC ATAGAATTGC TCTTGCCATT
131901 CGGGATTTGT AATATTTGCT CCTAAACCTG GAGTTTCTCC TTGTTGGTAC
131951 CATGCGGTTC CCAATACAGT GTCACCGTCG TTTTTCACTC CTAGATAGCC
132001 ATGGATGGGG CCCCAAAGGC CGAATCCTGA TATAGGGAAG ATCAAAGCTT
132051 GAACTGTAGA AAGGTCTTTC
132101 CGAGAGGTAT TCTCTAAAAT GACATAAAAG GGGAGGGGGG ATTGCTGACA
132151 CGGAGGGCTT TCTTGATATT TCTCAAAAAA TTCAATGGGA TTCAGATTTT
132201 TTTCTTCAAA AGAAAATACC TTGCCTTGGG CATCTGTAAG TAGAGGACGG

132251 ACAAAGCGCT CGGCATACAG CTCTAATTCA GGATAGGAAA CCTCAGAGAC 

132301 TTTTTTTGTA GCAACTTCAA GAAGTTGTGT TTTTTTATCG AAAGTCGCAG

132351 GCACCCACTC TTTTTTTTCC TGAATTTGAA ATCTTCCTTT AAAATCTAAA
132401 ATATGAGCAG CTAAAAGCAT TTGCTTATTG CGATCGAAAG TAGCAGCTTG 

132451 TTCCTGTATT GGGGAGAGCA CATAGTAGAT TGTGGATAAC AGCACTCCTG
132501 CAAATAAGCT GAGGCCCAGG ATAAAGGAAA CGATGTACCA GGTTTGGTTT 
132551 ATGCGGACGG TATGTTTTGA AGAGCCTTTA GACATATTCT AGACTCCCCT 
132601 TTTTCTATAC TTTCTAACAG CAAAATAGTC GATAAGAGGG GCAAATACAT 
132651 TGCCCAGAAG GATCGCTAAC ATCACTCCCT CAGGATACGC AGGATTGATA 
132701 AGACGAATCA CAATAGTCAT AAATCCTATA AAGAATCCGT AAATCCATTT
132751 CCCTAATTTC ATAGTCGGCG ATGATACGGG ATCCGTAGCC ATAAAGACTA

132801 AACCAAAAGC AAGTCCTCCG AGGAAAAGCT GCCGATAGGC GGGAATGAAG
132851 AATCGAGCAG GTGCCCAAGC TCCGTTTTGT CCCACGATGA GTACGCTGAT 
132901 AAACTTAAAG AGCCAGCCTG TGAGAAAGGC TCCTATCCCA AAGGCTGCCA 
132951 TGGTTCTCCA AGAGGCAATG CCTGTAACAA TAAGGAATAT TGCACCCAAC 

133001 AGACAGGCGA AAGTGGAGGT CTCCCCCAGA GAACCTATAA TGTTTCCCCA
133051 AAAGAGATTC CCAGCTGAGA ACTTCCCAAT CCCATAGATC ACATCGGTAA
133101 TAGCATAGGC AGAATCGAAC TGTGTGGGAA GCAGCCCCAA TCCTCCCTCA
133151 GCAACAGGAG CTGTAACAAA CGTTTGAAGT TGTGTAAGAG TGAGATTATC 
133201 TAAAACCCAA CCAGGATGCG TCTCTGTCCA AAGAGAAAAT TGTGAGTGAA 

133251 TGACATCTTG AGTAGGGACG TGAGGAATGT GAAGCATATT TGCAGCAATC
133301 GCATCGACAT GCAGACGCTT TACAGAGGGA GGTGTCGAAT TTAGAGTTTG
133351 TAGGCAGGTA GACTGTGAAA ATCCATCAAT GAGTACTTTT CCTGTCGAGG
133401 AGTTCATCTT CATGAGGCTA TCTTTAATCA CTCCGGGGTT GCTTCCTACC
133451 CAAACGTCAC CACTCATCTT TGCTGGAAAC GTAAAAAATA AGAATGCCCT
133501 TCCTGATAGA GCAGGATTGA GGATGTTCAT CCCTGTGCCT CCGAAGAGCT
133551 CTTTACTGAC AACAATACCA AAGGCGATCC CTAAGGCTGC CATCCAGTAA
133601 GGAATTGTCG GAGGGAGAGT AAGGGGATAG AGGATTCCGG TTACTAGCAG

133651 TCCTTCTGCG ATTTTATGCC CACGAACTAC AGCAAATAGG ACCTCACAAG 

133701 TACCCCCGAC AACATAGCTA ATCGTAAGTA GAGGAATAAA GATCTTAAGT 

133751 CCTTCCCAAA GGATAGGAAC TATATGGATC TCTTTGTAAA CAAAGGATAA 

133801 ATAACTACCA AATCCAGAAA TATGTAAGAA TTGCTCCATC AGCACAGGAT
133851 TGCCTGAGCT ATAAACGATA GATTGAAGTC CTGAATTCCA GATCGCAACA 
133901 AAGGTCGCGG GAAACAAAGC GATAACAACA AGCATCATCC AACGCTTAAC 
133951 ATCTACAGAA TCGCGGATGA AAGGAGGCTT GGAAGGGGTT TCAATAGGTT 

134001 CGTAACAAAA TGTATCTATC GCATCGACAA TGGGAGTAAA GCGCTGATAC
134051 TTGTCTTGTT GACATAGTTT CCAAAGAGAA TTTATGAATT TTTTGAGCAT
134101 TGTGATTGAG AGAAAAGTTG AAGCTTCTAC ACGTTCAAAA ACGTAGCAAA
134151 CTGCTTAAAA TTTTAGGAAT AAAAATTTTC ATGATTCAAA AATAGATGGT
134201 ACTTTTTTCG TTGCTGTTTC CAAAGTTATG TTATGGCTGT CAAGCTCCAG
134251 GAGCCTACTT TTGTTCCAAC TGCTTGGAAA AACTTCTCGT AGAAGATAGA
134301 GAAGGGCGTT GTCTACATTG TTTTCGTTAT CTTGGTTCTT CCGAAACACG
134351 TCTATGTAGC CAGTGTTCAC CCTCTTCACA ACTTCAAGCT TTCAGCTTGT
134401 ACCTTCCTTC GCAAACGGCC CTCTCGGTAT ATGCTCGTGC TTGTGAAGGT 
134451 AAGCGACCCG CTCTGCAGTT TTTTTCTAAG AGTATCGCCT TTGAGCTAGC 
134501 TTCACTGGAT GAGACTCCGA GTTGTATTGC CTATATAACA TCGACAATTT 
134551 CTAGGAAAAT CGTAGTAGAA GTTGCTAAAC TAGAAAAGCT TTTACGCATT 
134601 CCCTTGTGGC CGTGGCTTCC TAAGAAAAGA CAAATAGAAA AACTTCCTAA 
134651 AGGGGAAGGT ATCTGCTTTT TGTCGGCCTA TCCTTTATCA CAAAAATGGA
134701 TGCAAACTAT CGTTGGAGGG AGTGCATCAC CTCTAGTATC TATAAGTCTC
134751 TTTCTCTCTC AGAATGATCA GTAATTCCTG CAATTGCAAG GTAACCAAGA 
134801 ATACGTACGC CCTCATCAAC ACACAACGTG TTCGCTTGGA TGTCTCCTTT 
134851 AATGATTGCG CCTCCACGGA GTTCGACTTT TCCAGATACT GTGATATTTC 
134901 CTTCTACAAC CCCTTCAATA ATGGCTTCTT GTAGCTGAAT ATCTGCCTTT
134951 ACCACTCCTT TAGGACCGAT AATAATTTTT CCTTTTGAGA CTAAAATGCC
135001 TTCAAAAGTT CCGTCAATAC GTAGGAGACG TTCAAAAGCA AGTTCTCCTT
135051 TAAAGGTGAC GCCTTCTCCT AAGGTAGTTT CAGGTTCTTC AAGAGGGAGT 
135101 AGAGATTCTG TTCTTGGAGT TGAGGACCAT TGAGGAAGAG AAGATTCTTC 
135151 AGTTAAATTG TGATTCAAAG GGCGAGCTTC CGAAGCTTTA GGGTTGTCAA
135201 AAAGACTTGG AGGGGTCTCT GGGCGCTCGG ATCTTGAATA TGGCGAGTAG
135251 CTGGAAGGTG AAGAAGTTTC TTCTTCGTAA AGTGTTTGCA CATCTTCAAA 
135301 AGGACCTTTT CCTGTTCTAC GGAACATGGG ACACCCCCTA AATTAACCAA
135351 CAATATGATT TTTACAGAGA TTTACTTGAC GCTTAAGAGT TTCGTCATTT
135401 TGTAATTTAT GTTCTATAGT TTTACAGGCA TAAAGTACTG TCGAATGAGT
135451 TTTACCAAAA GCAGCTCCTA TTGCAACTAA AGAATCTGTA ATAAGAGTTT
135501 TTGCTAAATA CATAGCAATT TGCCGAGCTA ACACAAGATC TTTAGAGCGT
135551 GAGTTTCCCT TAAGATCATT CAGCTTTACT TGGAATACTG TAGCAACACT
135601 TTTTAAGATC GTTTCTACAG AAATTTTTTG TTTTGTTGGA GAACGGAAGA
135651 GCTCTTTTAG AGTTTCTCGG ACTGTAGTTT CTGTAAGAGA CTTGCCGAAA
135701 AGACGACAAT AGGCAGTCAG CTTGTTGATA GCTCCTTCCA ATTGACGGAC
135751 ATTGCCATAG ATGTGATCCG CAATATAAAA TGCCATTTCA TTAGGAATGA
135801 GCAATCCTTT TTGCTCCGCC TTGTGCTGTA AAATCGCAAC CCGAGTTTCT
135851 AAATCAGGGA TGCCGACGTG AGCAACCAGT CCCCATTCCA TTCTAGCAAT
135901 GATACGCTCG GAAAGTTTGA GCTGACTTGG AGGTTTATCA CTGGTAATTA
135951 CAATTTGCTT ACTCAGGTTG ATCAAAGTCT CAAAGGTATT GCAAAACTCT
136001 TCTTCAAAAT TTTGGCGATT CTGTAAAAAT TGAATATCAT CAACAAGAAG
136051 TAAATCTAGG GAACGATAAA AATTTTTCAT TTTATCAACA GACTTGGATT
136101 TGAGATGGTA GACAAGATCG TTGATAAACG CTTCTGTAGT GATGCAATGG
136151 ATGCGTAGAT TTTTATGATG TTCTCTTACG TAGTGACCTA CGGCATGAAG
136201 TAAATGCGTT TTGCCTAATC CCACACCCCC ATGGATGAAT AAAGGGTTGT
136251 AGGAGCGGCC AGGTTTCCCA GCAATACCTA CAGCTGCAGA CTTCACAAAT




136301 TGATTTGAGG GACCTTCAAT GAAATTATCA AAGCGATAGG AGAGATTCAG
136351 CTTTAATTCA AAATCTTTAG TTTCTTCAAA GACCTCAGAA ATTCCTTCGT
136401 TTGATTCTTT TTGAGAAGCC ACGGGGGCTG AAGGTTTCTT GTGTTCTGCA
136451 ACTACAAATT CTAAAGCAGG CTCTCCATGA ACATCTAAGG GGACAAAAGA
136501 ACAGAGGTCT CTTTTGTAGT TATCAAGAAG ATAATTTTGT ACAAAAATGT
136551 TGGGGACTTC TAAGCGAATT TTCTCTTGAG TTTCTTCAAG AACTTGAATA
136601 GGAGAAATCC AATTTTCAAA AGCCGTTTTC GAGCAACGTG TCTTAACATA
136651 ATTTAAAAAC TGTTCCCAAG TAGTGCACTC GTTACAGGTT AACATGCCGC
136701 TCTCTTTATT TATAAAGCTT TCCCAAATAC AATCGACCCA TCCCATGAGT
136751 GATGGCGAGA AAATCTCATT GCATCTTGAC TATTCATTGC ACTGAGTGCA
136801 ATGACTCTTC GCGTTCAGAT TGGATCCTGC ATTCCCTGGG TGAAGTTCCT
136851 GTTAGGAATG GATCCTATAC ACCCTTCCTC CGCAGGTAAA CGCGGTACGC
136901 TCTGCCGATA AAATAACTCT ACCGATTACT AGGTTTTAAG GCAAATTGGA
136951 TCGTTGGTTT CGTTAGGCAA TAAGGAACCA CAAATTCAGG AAAAAATAAT
137001 TATGAAATTT TGTAATAAAA ATGGAAAAAG AACTAAAGAA ATCCGGAGTT
137051 CTTCAATCAC GAAATACGTT TTCTATAGGA GAAAAAATTA ACGAACTAAC

137101 GCAGCATTTT TTTTGGTTGC TTTACTATAA CTCATCAATA GAGCTTCAGC 

137151 ATCATTAGCA ATTGCTGGTA TGGGACAGGA TGAAAGGTAG GTGGCAATGG 

137201 CAGTAGCTTC TTCAATTCGT CCCAAACAGA AGAGAGCTTT TGTTTTATTT 

137251 AAGAGTGTAG GCAGATGATC TCCTTGCATG CGGAGTGCCT GATCTAAAAC 

137301 AGCAAGCGCC TGACTATTTT CACCAATTTG GAGATAAAGA CCTCCAAGAG
137351 TTTGATGATC ATAGATACTT AAAGGATCTA AGATCACTAG AGCTTCAAAA
137401 AAAAGAATCG CTTTTGAATA ATGCCCTTGG CGTAGAAAAG AATATCCTGA
137451 GATTCTGAGT TCTTCTAACT CATCATCTCC CCAGCCTAAG ATTGCTTTCC 
137501 ATTCATTATC CAACATACAT TATCCTTGAA CAAATTGAAA GATACGAGAG 
137551 ATCACATAGT CTATCATCAC ATTTGTAGAC TTGACTCCAA GATCTAGAGC 
137601 TTTTAAGAGC ACTTTCCCTT GTTGTACTCT AGGATCGTCA TCGTCTTCTT
137651 CAGGATCTTC GTCTTCCTCT TCTTCTTTAT CTTTGTAAGA CAAAAAGGGA

137701 GTAATTTGAC TGCGATAGGA AAACTTCCCT CGAGTGAGAA CTTTTAAAAA 

137751 TGAGGAGATT TTTTCTATGT CTTCATCTTG TTGGTCTGGA GATCCTAAAG
137801 AAGGTGCCAG GTAGGGTGTG GAAAAACGCT GTTTGTAAAA ATTATTTGGA
137851 GGGGAAAAAC ATGCCCAGTG CGACTTTTGA TTTGTCTGTA AAAGAGACGT
137901 CAAAGCCGAA GGCTTGGGGT TCATATCCAA AATTTGCGCA TGCTTGGCAA
137951 CATCACGAAT GGAGATGCCT TCCATCTGGA TTTCTTTGCG AAAGTCGCTG 

138001 ACTATCCTAT TATTGGAAGC ATGTTGCTCA TATATAGACG TGCTATAATT
138051 AAAAATTTCT ACCATGGCGA GGCCTTAAAA AACCGTCTCT ATCTCTAGAT
138101 GATAGTGCGG TCTAGGAAAA AGTCAATAGT CTTGATCAGA AGCCTGAACC 
138151 TTTTCTCCTT ATCGAAGGCA CTTAAGAAAA AAGCTCACCC CTATCAAAAA 
138201 ATTTAGAGTG GCGAACTAGC GAGAATTTAA GATAAGGGAA GGCTTTCTTC
138251 TTTCTATGGA ATCCTGTATC TTTGTCCTTA TTTGAAGCCA AGAGCTTAGG
138301 AAATTCTTAT GTCCGAACGT GCGCATATTC CCGTATTAGT TGAAGAATGT
138351 TTAGCTTTAT TTGCTCAACG TCCTCCACAG ACTTTTCGAG ATGTCACCTT
138401 AGGAGCTGGA GGACATGCGT ATGCTTTTCT TGAGGCGTAT CCCTCTCTAA
138451 CTTGTTATGA TGGCTCCGAT CGAGATCTTC AGGCTTTGGC AATTGCAGAA
138501 AAACGTTTGG AGACCTTTCA AGATAGAGTC TCCTTTTCCC ACGCCTCTTT
138551 TGAAGATCTT GCGAACCAAC CCACTCCACG TCTTTATGAC GGAGTTCTTG
138601 CAGATTTAGG AGTCTCTTCT ATGCAGCTGG ATACTCTATC CCGAGGGTTT
138651 AGCTTTCAAG GGGAAAAAGA AGAGTTGGAT ATGCGTATGG ATCAAACGCA 
138701 AGAGCTTTCC GCTAGCGATG TCCTGAACTC CCTAAAAGAA GAAGAACTAG 
138751 GGAGAATTTT TCGTGAATAT GGAGAGGAAC CACAATGGAA ATCTGCAGCT 
138801 AAAGCTGTTG TCCATTTTCG TAAGCATAAA AAAATTCTTT CGATCCAGGA
138851 TGTAAAAGAA GCTCTTCTTG GCGTTTTCCC TCACTATCGT TTTCATAGAA 
138901 AAATACATCC ACTCACCTTG ATTTTTCAAG CTCTACGTGT TTATGTGAAT 
138951 GGAGAGGATA GACAATTGAA AAGTTTACTA ACATCTGCTA TATCTTGGCT
139001 GGCTCCTCAG GGACGGCTTG TCATTATTTC TTTTTGTAGC TCTGAGGATC
139051 GTCCTGTGAA GTGGTTTTTT AAAGAGGCGG AAGCTTCTGG CCTGGGGAAG
139101 GTAATCACAA AGAAAGTGAT CCAACCTACC TACCAAGAAG TACGAAGAAA
139151 TCCTAGATCG AGATCAGCAA AACTACGGTG TTTTGAAAAA GCTTCCCAAT
139201 GAACAAAAGT CGTTTTTTAC GTTTATGCTG CTGTCTATGC TTTTGTGGAA
139251 GTCTCTTTTA TTTCTATATT AATAAGCAGA ACTCGCTGAC GAAATTACGC
139301 CTCGAAATTC CTTGTTTATC TGTACGCTTG CGTCAGCTTG AGCAGCAAAA
139351 TATTTCTTTA CGTTTTTTAA TTGATAAAAT AGAAAGACCT GATCATTTGA
139401 TGGAAATAGC AGCTCTTCCC GAATACCAAT ATTTGGAATA TCCCTCAGAA
139451 GAAAGTATCA GTCTTTTATC CTATGAGCTA CCGTAAACGT TCGACTCTAA
139501 TTGTTCTAGG AGTGTTTGCT CTTTATGCTC TTCTAGTATT GCGTTATTAT
139551 AAAATTCAAA TTTGTGAAGG AGACCACTGG GCCGCAGAAG CTCTCGGGCA
139601 ACACGAATTT TGTGTCCGTG ATCCTTTTCG AAGGGGCACC TTTTTTGCTA
139651 ACACGACAGT ACGTAAGGGA GACAAAGACC TTCAGCAGCC TTTCGCTGTC
139701 GATATTACAA AATTTCACCT TTGTGCAGAT CCTTTAGCTA TTCCCGAATG
139751 TCATCGTGAT GAGATCATCC AAGGGATTCT CCAATTTATT GAGGGGCAGA
139801 CCTACGACGA CCTCTCCCTA AAGTTAGATA AGAAATCTCG GTATTGTAAG
139851 CTGTATCCTT TATTAGATGT TTCTGTCCAT GACCGGCTAT CCCTTTGGTG
139901 GAAAGGATAT GCAACAAAGC ATCGCTTACC AACAAACGCC CTATTTTTTA
139951 TTACGGACTA CCAACGCTCG TATCCTTTTG GGAAGCTCCT TGGACAAGTT
140001 CTCCATACCT TAAGAGAAAT TAAGGATGAG AAAACAGGAA AAGCCTTTCC
140051 CACAGGCGGG ATGGAGGCGT ACTTTAATCA TATTCTGGAA GGGGACGTTG
140101 GAGAGAGAAA GCTGTTGCGT TCTCCTTTGA ACCGTTTAGA TACGAATCGT
140151 GTTATCAAAC TGCCTAAAGA TGGCTCTGAT ATCTACCTTA CGATCAATCC
140201 TGTGATCCAG ACCATTGCAG AGGAAGAACT CGAACGGGGC GTGCTAGAAG
140251 CTAAAGCCCA GGGGGGTAGG CTCATTCTAA TGAACTCCCA AACAGGAGAG
140301 ATTCTTGCAC TGGCTCAATA TCCGTTTTTC GATCCCACAA ATTATAAGGA
140351 ATACTTCAAT AACAAAGAGC GCATCGAACA TACGAAGGTA TCTTTTGTGA
140401 GCGATGTTTT TGAACCCGGG TCGATCATGA AACCTTTGAC TGTGGCGATT
140451 GCTTTACAAG CTAACGAAGA GGCTAGCTTA AAATCGCAGA AAAAGATTTT
140501 TGATCCTGAA GAACCTATCG ATGTGACCAG GACACTCTTC CCTGGACGAA
140551 AAGGATCTCC GCTTAAGGAT ATTTCTAGAA ACTCTCAATT GAATATGTAC
140601 ATGGCTATCC AGAAATCTTC GAATGTCTAT GTAGCTCAGC TGGCTGACCG
140651 CATCATACAA TCTTTAGGAG TGGCCTGGTA CCAACAGAAG TTGCTAGCTC
140701 TGGGATTTGG AAGAAAAACA GGGATCGAGC TTCCCAGTGA GGCCTCTGGT
140751 TTGGTGCCTT CTCCCCATCG TTTCCATATT AATGGTTCCC TGGAATGGTC
140801 CTTATCTACT CCATATTCTT TGGCTATGGG ATATAATATT TTGGCAACAG
140851 GGATACAAAT GGTTCAAGCC TACGCTATCC TTGCAAACGG AGGTTATGCC 
140901 GTCCGGCCCA CTTTAGTAAA AAAGATCGTC TCTGCTTCAG GAGAGGAATA 
140951 TCATCTTCCT ACTAAAGAGA AGACACGACT CTTTTCAGAA GAAATTACTA 
141001 GAGAAGTTGT TCGTGCCATG CGTTTTACAA CGTTACCCGG AGGTTCGGGA
141051 TTTCGAGCCT CTCCTAAGCA TCACTCTAGT GCTGGGAAAA CAGGAACTAC 
141101 AGAAAAGATG ATTCATGGAA AATATGATAA ACGCCGTCAT ATTGCTTCTT 
141151 TTATAGGTTT TACTCCCGTA GAGAGCTCGG AGGGAAATTT CCCACCTTTA 
141201 GTGATGCTCG TCTCCATAGA TGATCCTGAA TATGGTTTGC GAGCCGACGG
141251 CACGAAAAAT TATATGGGGG GGCGTTGTGC GGCACCCATT TTTTCTAGGG
141301 TTGCTGACCG CACACTCCTC TATTTAGGGA TTCTTCCAGA CAAGAAGCTA
141351 AGAAATTGCG ACGAAGAAGC TGCTGCATTA AAGCGTCTCT ATGAAGAATG
141401 GAATCGTTCT CCGAAACAAG GGGGAACGAG GTGAGGATCT CTATTTCCAT
141451 CTTGCTATAG ACTTTTACCG TTGAGCAAAG ACTCTCTATC AGAGAGCCCG 

141501 TCTCCTCTTT ATCCTCTATG AGTAGTTTAT GTTATGGCTA GGGTAGGTCC
141551 TAAACTATAG AAATAACTTT AGCTTTCTTC CCCTAAATAA GAGACCAAAG
141601 TCTTGATGAG ACGGTCTATT GAAGTTTATG GAAGGGGGAG GTAAGGCTGT
141651 GTGTTTGGGG ATTTAGATTT GGGATAAAGG AGGCTTCTGT TCGTAGAAAC
141701 AGGAGAGCGA AATTTTATAT TTCAGAGAAG AGTAAGAACT TTATGGACAG
141751 TTTTTTGTGA TTGCTTGTAT ACTATCTTGA TTGAATTTTT TGTCGACCTA
141801 CGAGTAAAGA AATCCTTTAA GCATTTTTTA AAAATCAGAG TGAGAGCATG
141851 CCCCTAGAGG GCTTTTTATG AAAAAAGTTG TTTTTCAATA GTCCCTGGAG
141901 CGTAAATGGA TTTAAAAGAG TTACTCCATG GGGTTCAAGC TAAAATCTAT
141951 GGGAAAGTTC GCCCTCTTGA AGTGCGCAAC TTGACACGTG ATTCCCGTTG
142001 TGTGAGTGTT GGCGACATTT TTATAGCCCA TAAGGGACAG CGCTACGACG
142051 GAAATGATTT TGCTGTCGAT GCTTTAGCTA ATGGAGCAAT TGCCATTGCT
142101 TCTTCACTAT ACAATCCGTT TCTTTCCGTT GTTCAGATCA TCACTCCTAA
142151 TCTCGAAGAA TTAGAGGCTG AGCTTTCTGC AAAGTATTAC GAATACCCTT
142201 CAAGTAAGCT CCATACCATT GGGGTGACTG GAACCAATGG GAAAACTACA
142251 GTTACATGTT TGATTAAAGC TTTATTGGAT AGCTATCAAA AACCTTCAGG
142301 GCTTTTAGGA ACCATAGAGC ATATCTTAGG AGAGGGGGTG ATTAAAGATG
142351 GGTTTACTAC ACCTACACCC GCTCTTTTAC AGAAGTATTT AGCCACTATG
142401 GTACGTCAAA ATAGAGACGC TGTTGTTATG GAAGTCTCTT CTATAGGACT
142451 TGCCTCTGGA AGAGTAGCCT ATACCAATTT TGATACAGCA GTTCTGACTA
142501 ATATTACCTT AGATCATCTC GATTTTCATG GCACATTTGA AACCTATGTT
142551 GCGGCGAAAG CCAAGCTTTT CTCTCTCGTG CCCCCTTCGG GAATGGTTGT
142601 TATCAACACA GACTCTCCCT ACGGTTCTCA GTGTATTGAG AGTGCAAAGG
142651 CACCGGTCAT CACTTATGGT ATAGAGAGTG CTGCTGACTA CCGAGCCACC
142701 GATATCCAAC TTTCTTCCTC GGGAACAAAG TATACCTTGG TGTACGGGGA
142751 CCAAAAAATT GCGTGCTCTT CCTCATTTAT TGGAAAGTAC AACGTCTATA
142801 ACCTACTTGC TGCGATCTCT ACAGTACATG CAAGTTTGCG TTGCGATCTT
142851 GAAGATTTGC TAGAAAAGAT AGGCTTGTGT CAACCTCCTC CAGGTCGTTT


142901 GGATCCTGTA CTTATGGGTC CCTGCCCTGT ATATATTGAT TATGCACACA
142951 CCCCCGATGC TTTAGACAAT GTCTTAACAG GATTGCATGA GTTACTTCCT
143001 GAGGGGGGAA GACTGATTGT TGTTTTTGGT TGCGGTGGAG ATAGAGATCG
143051 CAGTAAACGG AAGTTGATGG CCCAGGTGGT AGAGCGTTAT GGTTTTGCTG
143101 TTGTAACTTC AGATAACCCT AGGAGCGAGC CTCCTGAAGA TATTGTGAAT
143151 GAAATTTGTG ATGGGTTTTA TTCAAAAAAC TATTTCATCG AAATCGACAG
143201 AAAACAAGCA ATTACATATG CTCTGTCTAT TGCCTCAGAT AGAGATATAG
143251 TGTTAATAGC GGGAAAAGGG CATGAAGCTT ACCAAATATT TAAACACCAA
143301 ACAGTTGCGT TCGATGATAA GCAGACTGTT TGTGAGGTAC TCGCTTCCTA
143351 TGTCTAAGCA ACTGTCGTTT TTTGCTTTAT GTGTGTTAGG AAGTCACCCG
143401 ATTTTTGCTC AAACACCGAA TCCTCCTCAG CGTGTACGAC GCAGTGAGGT
143451 TATATTTATA GATCCTGGAC ACGGGGGAAA AGATCAAGGC ACGGCAAGTA
143501 AGGAACTTCA TTATGAAGAG AAGTCCCTGA CCCTGTCTCT TGCTTTGACG
143551 GTTCAAAGTT ACTTAAAGCG GATGGGTTAT AAACCTCAGC TAACCCGATC
143601 TTCTGATGTA TACGTTGACT TAGGGAAACG CGTTGCTTTG TCGAACCGTG
143651 GGCAGGGGGA TGTCTTTATC AGCATCCACT GTAATCATTC TTCAAACGCA
143701 GCAGCCTTTG GCACCGAAGT ATATTTTTAT AATGGTAAGG TCGGATCTCC
143751 GACTAGGAAT CGCATGTCAG AAGTACTGGG AAAAAACATT TTAGCTGCTA
143801 TGGAAAAAAA TGGCATTTTG AAGTCTCGAG GTTTGAAAAC TGCGAACTTT
143851 GTTGTGATTA GAGATACTTC TATGCCTGCA GTTTTGGTGG AAACCGGGTT
143901 TTTATCCAAT AGTCGTGAAC GTGCGGCCCT GCAAGATGCT CGCTATCGTA
143951 TGCATGTAGC GAAAGGCATC GCCGAGGGAG TTCATAATTT TCTTTCTGGA
144001 CCTAGTTTTC AGAAACCAAA ACAGAATATC GCTAAAATAC GTAAACCACA
144051 GATACAAGCA AATTAGTACT TTAGGAGTTA AAGGCAAAAA ATCGTCCTCG
144101 ATGCGATTCG AACGCATGGC CTGCTGCTTA GGAGGCAACC GCTCTATCCT
144151 GCTGAGCTAC GAGGACGCAA AGACCAGCAC TTTACCAAGT TAGATTAAAG
144201 AAG I CACGTG TTAGATGCAC GCTAGCAATT TAGGGGAAGT TTTTCTCAAG
144251 ATGTGGGAAT GATTTTTCTA GGTTCTAGAA ATATAGTTAT TCGCATTAAT
144301 CGATATGGTT TATAGTGATT GCGCATTTTT TAAAAAATGT CTTGAATCCA
144351 AAGGATGAAT AGATATGATG AGCTTAGACT TCAAAAGTTT ATTATTTACG



144401 GTTACTCAAC GAGCTACTTT AGGGCACTTT AATAGGAGGC ATTGTCTAAT
144451 ATGGCTACCA TGACAAAGAA GAAACTAATC AGCACGATCT CACAAGATCA
144501 CAAAATTCAT CCTAATCACG TACGTACCGT GATTCAGAAT TTTCTAGATA 
144551 AAATGACCGA CGCCTTGGTT AAAGGTGACA GGCTTGAGTT TAGAGATTTT 
144601 GGTGTGTTGC AAGTAGTAGA AAGAAAACCA AAGGTAGGAC GTAATCCTAA
144651 GAATGCAGCA GTCCCCATTC ATATTCCTGC TAGACGCGCT GTAAAGTTTA
144701 CTCCAGGGAA AAGAATGAAG CGCTTGATAG AAACTCCGAA TAAGCATTCT
144751 TAATTCTTGT AGTCTTCTTT GTCTCAGTTG TTAGAGTCAG ACCGGTTTTT 
144801 TACCGGGCTT GACTCTAATT TTTGTTATTA TTATCGTTTG GTGCAATGCT
144851 TTTCTGATCA AATTGTGCGT GATAATGGGG CTGCAATCCA GGTTACAACA 
144901 TTGTATAGAA GTGTCCCAGA ATTCGAACTT TGATTCACAA GTAAAACAGT
144951 TTATCTATGC GTGCCAAGAT AAGACATTAA GGCAGTCTGT ACTCAAGATT
145001 TTCCGCTACC ATCCTTTACT AAAAATTCAT GATATTGCTC GGGCCGTCTA 
145051 TCTTTTGATG GCCTTAGAAG AAGGCGAGGA TTTAGGCTTA AGCTTTTTAA
145101 ATGTACAGCA GTACCCTTCA GGTGCTGTAG AACTGTTTTC TTGTGGGGGA
145151 TTTCCTTGGA AAGGATTACC TTATCCTGCA GAACATGCGG AATTTGGCCT
145201 ACTCCTGTTA CAGATCGCAG AGTTTTATGA AGAGAGTCAG GCATACGTCT 
145251 CTAAAATGAG TCATTTTCAA CAGGCACTCT TTGATCACCA AGGGAGCGTC 
145301 TTTCCCTCTC TCTGGAGCCA GGAGAACTCT CGACTCCTAA AAGAAAAGAC
145351 AACTCTTAGC CAATCGTTTC TCTTCCAATT AGGAATGCAA ATTCACCCAG
145401 AATACAGTCT TGAGGATCCT GCACTAGGGT TCTGGATGCA AAGAACGCGT
145451 TCTTCATCCG CTTTTGTAGC CGCTTCAGGA TGTCAAAGTA GCTTGGGAGC 
145501 GTATTCCTCA GGGGATGTCG GTGTTATCGC TTATGGACCT TGCTCTGGAG 
145551 ACATTAGTGA TTGTTATTAT TTTGGATGTT GTGGAATCGC TAAAGAGTTC 
145601 GTGTGCCAAA AATCTCACCA AACTACAGAG ATTTCTTTTC TCACCTCTAC
145651 AGGAAAGCCT CATCCCAGAA ATACGGGATT TTCCTACCTT CGAGATTCCT
145701 ATGTACATCT GCCGATCCGC TGTAAGATCA CTATTTCCGA CAAGCAATAT
145751 CGCGTGCACG CTGCGTTGGC TGAGGCCACC TCTGCCATGA CGTTTTCTAT

145801 TTTCTGTAAG GGGAAGAATT GTCAGGTTGT TGACGGCCCT CGCTTGCGCT 

145851 CCTGTTCCCT AGATTCTTAT AAAGGTCCCG GAAACGACAT TATGATTCTT 

145901 GGGGAAAATG ACGCAATCAA CATTGTTTCT GCAAGTCCCT ATATGGAAAT 

145951 TTTTGCTTTG CAAGGCAAAG AAAAATTTTG GAATGCAGAC TTTTTGATTA 

146001 ATATTCCTTA CAAAGAAGAG GGCGTCATGT TAATTTTTGA AAAAAAAGTG 

146051 ACCTCTGAGA AAGGAAGATT CTTTACGAAG ATGAATTAAT TTTGGGTCTG

146101 TAATTGTGTT TAAGAATTGT TTGTATTAAA ATGATTCTTT TTATACGAGG 

146151 AGAGCACATT CTAATGGAAC TTCTTCCACA CGAAAAACAA GTAGTTGAAT 

146201 ATGAAAAGGC TATAGCCGAA TTTAAAGAAA AAAATAAGAA AAATTCTCTC 

146251 TTATCTTCTT CAGAGATTCA GAAATTGGAA AAGCGTTTAG ATAAATTAAA 

146301 AGAAAAGATC TATTCGGATT TGACTCCTTG GGAGCGTGTA CAAATATGTC
146351 GCCACCCTTC GCGTCCCCGT ACTGTCAACT ATATTGAAGG GATGTGTGAG
146401 GAGTTTGTCG AGCTTTGTGG AGATCGCACC TTCCGAGATG ATCCCGCAGT
146451 TGTTGGTGGC TTTGTAAAAA TCCAGGGTCA GCGTTTTGTC CTTATTGGCC 
146501 AAGAAAAGGG ATGCGATACA GCGTCACGCC TTCATAGGAA CTTCGGTATG 
146551 TTATGTCCCG AGGGTTTCAG AAAAGCCCTT CGCTTAGGAA AACTCGCTGA 
146601 AAAGTTTGGC TTGCCTGTGG TCTTTCTTGT CGATACCCCA GGAGCATATC 
146651 CTGGATTGAC TGCTGAAGAG AGAGGACAAG GATGGGCAAT TGCCAAAAAT 
146701 CTTTTTGAGC TCTCAAGACT TGCCACTCCC GTGATTATTG TCGTTATCGG
146751 TGAGGGATGT TCAGGTGGAG CTTTGGGCAT GGCTGTAGGT GATTCTGTAG 
146801 CTATGTTAGA GCATTCCTAT TATTCTGTAA TTTCCCCAGA AGGATGCGCC 
146851 TCCATTCTTT GGAAAGATCC TAAGAAAAAT AGCGAAGCAG CTTCCATGTT 
146901 GAAAATGCAT GGAGAAAACT TAAAACAATT TGGCATTATC GATACTGTTA
146951 TCAAAGAGCC CATTGGGGGA GCTCACCACG ATCCTGCATT GGTATATAGC 
147001 AATGTTCGAG AGTTTATCAT CCAAGAGTGG TTACGATTAA AAGATCTAGC 
147051 TATAGAAGAG CTGTTGGAGA AACGGTACGA AAAATTTCGC TCTATAGGTC
147101 TTTATGAAAC TACTTCTGAA AGCGGTCCTG AGGCATAAAA ATCATCTCGT
147151 TATATTAGGC TGTTCTCTAC TCGCAATTTT AGGACTTACC TTTTCATCTC
147201 AGATGGAGAT TTTTTCTTTA GGGATGATTG CTAAAACAGG CCCCGACGCC
147251 TTTTTACTTT TTGGACGTAA GGAATCTGGA AAACTTGTAA AGGTTTCAGA
147301 ACTAAGTCAG AAAGATATTT TAGAGAATTG GCAGGCAATT AGTAAGGATT
147351 CAGAGACACT TACAGTCTCT GATGCCACGA CATACATCGC CGAACATGGG
147401 AAAAGCACAG CCTCTCTGAC GAGCAAGCTC TCTAAGTTTG TCCGTAACTA
147451 CATCGATGTG AGCCGCTTTC GAGGACTGGC AATCTTCTTA ATCTGCGTTG
147501 CTATTTTTAA AGCAGTCACC TTATTTTTCC AACGTTTCCT TGGGCAAGTC
147551 GTTGCTATAC GGGTAAGCCG AGACTTACGT CAGGACTACT TTAAGGCCCT
147601 ACAACAACTC CCCATGACCT TCTTCCATGA TCATGATATC GGTAATTTAA
147651 GTAATCGTGT CATGACAGAT TCTGCAAGCA TTGCCTTAGC AGTAAACTCT
147701 TTAATGATTA ACTACATTCA AGCCCCAATT ACCTTCATAT TGACATTGGG
147751 AGTCTGTCTG TCGATTTCAT GGAAGTTTTC AATTCTTATT TGTGTTGCCT
147801 TTCCTATCTT TATCCTTCCC ATTGTCGTGA TCGCTAGAAA GATCAAAAAT
147851 TTAGCAAAAC GTATTCAAAA GAGTCAGGAT TCATTTTCCT CCGTTCTTTA
147901 TGATTTTCTT GCTGGGGTTA TGACAGTAAA AGTCTTTCGT ACAGAAAAAT
147951 TTGCCTTCAC AAAATATTGT GAGCATAACA ATAAGATTTC TGCTTTAGAG
148001 GAGAAAAGTG CTGCTTACGG TTTGCTTCCA CGACCCCTCC TGCATACCAT
148051 AGCTTCTTTA TTTTTTGCTT TTGTCGTCGT TATCGGAATT TATAAATTTG
148101 CTATTCCTCC CGAAGAACTT ATCGTATTTT GTGGTTTGCT CTACCTAATC
148151 TACGACCCTA TTAAGAAGTT CGGGGATGAA AATACCTCCA TCATGAGGGG
148201 ATGTGCTGCT GCGGAGAGAT TTTATGAAGT CTTGAATCAC CCCGATCTTC
148251 ATAGTCAAAA AGAAAGAGAA ATCGAGTTCC TTGGACTTTC TAATACAATC
148301 ACATTCGAGA ATGTTTCCTT CGGCTATCAG GAAGATAAGC ACATCCTCAA
148351 AAATCTAAGC TTTACCTTAC ATAAAGGCGA AGCTCTAGGC ATTGTAGGAC
148401 CTACAGGATC TGGAAAAACA ACACTTGTTA AATTACTTCC TAGGCTCTAC
148451 GAAGTCTCCC AAGGAAAGAT TCTTATCGAC TCTCTTCCTA TTACGGAATA
148501 TAACAAAGGG TCCTTAAGGA ATCACATCGC CTGTGTATTA CAGAATCCTT 
148551 TCTTATTCTA TGATACTGTA TGGAATAACC TTACCTGTGG TAAGGATATG 
148601 GAGGAGGAGG CTGTTTTAGA AGCTCTAAAA CGTGCCTACG CTGATGAGTT 
148651 TATTTTAAAG CTCCCTAAAG GAGTCCATAG CGTGCTCGAA GAATCTGGGA 
148701 AGAATCTCTC AGGAGGACAG CAGCAACGTT TGGCAATAGC ACGTGCTCTG
148751 TTGAAAAACG CCTCCATCTT AATTTTAGAT GAGGCAACGT CAGCTCTAGA
148801 TGCCATTAGT GAAAATTACA TTAAGAATAT CATTGGAGAG CTTAAAGGAC
148851 AGTGCACACA AATCATTATT GCCCACAAGC TGACCACTCT TGAACATGTA
148901 GATCGCGTGC TCTACATAGA AAATGGTCAA AAAATTGCCG AAGGCACAAA
148951 AGAAGAACTC TTACAGACGT GTCCTGAATT TTTAAAAATG TGGGAGCTCT 
149001 CAGGGACTAA AGAATATAAC AGGGTCTTTG TTCCTGATCA CAAATTAGTC
149051 GCAAATCCTA CGGACATGGC AATAACAACT TAGGTGGGAT CGCTCTCTCC 
149101 ATGAGCTCAG GCAACAACTC TACAAGTGTC TGAGTTAGCT TTTGTGATAC
149151 CTCCTCCAAT CTGCTGAAGG GACAGTCTCC TGGAACAGTA TAATCAGAAG 
149201 TGATCTTGAG AAAAGAACAG GGGATGTGAT GTTCTGCTGC TTGTGAGGCT
149251 ATAGCATAGC CTTCCATATC TAGAAGTTTA AACGTCTTAT GAAACCCATA 
149301 ATGGTACAAT ACTGGAGAGG TAACCAGAGA GCTTTTAGGT AGAGAATCCG 
149351 GTAGAGCGTC AAAGATATAA GGGGGATCTT CAGAGAGAAC AGGAGGTGTA 
149401 TCCGTAGTGA GGTTTGCAAT TTTCTCAATA GTGTAACATT GACCTAAAGG 
149451 AATCTCGGGA GAACATGCCC CCACAAAACC TGGATTGATC CACAGATCGT 
149501 AATCTGTATA TGCTTGGCAA TAGCTTTGAA GAGCATTTAA AACGGCTGTA 
149551 CTTCCCCAAA CATGGACAAT ATAGAGATCT AGATGGTAGT CAGTACAACG 
149601 ATAACTATAG AGATGCTCGT TGATCTGTGT AAAATCAAGT TGTTCAATTA 
149651 GAGGAGAAAT TTCTCTATAG TCTGCAACAA TGCAAAGGAT TTTTTTAGGT 
149701 GTATTGACAG CATTCATTGG CCTTCCAGAG CATATGTAAA GCTTTTTTCC
149751 CAGTTTTAGA TAGTTGAAAG GTTTCTTTGT TGATATAGGT TCCTATGAAT
149801 CTATGAATCA CGGTCACGTT TTTATTTTTA GAGTATTCTA CTGCTTTTGC

149851 TCCCGCAGTT ATAGGATCTT TCAGGGAGCA AATTAAAGAC TTTCTTAATG 

149901 CTGCTGTTAG AGCATCCACT GTAGCCATAG GAACATATTT CGCAATGGCT 

149951 AAACATCCTA AAGGAAGGGG AAAGATGGTC TTACGGCGCC ATAGCTCTCC 

150001 AAAGTCTGCC CGCAATGTCA ATTGGAGATC GTAGCTGAAG CGCTCTTCAT

150051 GAATCAGAGC GCCTCCATCG ACTTTCCCTT GCAGTATCGC GGATAGAATT 

150101 TTGTCATAAG GCATGGGAAT GAGTTTTGCC TTGGGATAGT AAAGTTTACA 

150151 GAGAGCATGA GCGGTTGTCA TCTCTCCAGG AGTTGCCAAG GTATCTAGAG 

150201 AACATTCAGG ATCTAAGGAG AGGACGATAG GACCGCTGTT GTATCCTAAG 

150251 GTATTTCCTA CGTCCATAAG ATTATAATAA TCAGAAACTA GAGGGAAGAG 

150301 CGCTGCTGAC ATTTTCATTA GGGAGAGCCG TCGCTGCAGA GCTAGGGTAT 

150351 TCAAAGTTTC AATATCCGCA ATTGTTACCT GGTTAAGAAG AGGCCTGAAT

150401 TGGGGGTCTT TTAAGAAAGA ACGAAAAAGG AAAATATCAT TCGGGCAAGG 

150451 AGAAAAGGCA GCAGTCAGTA TCATGTCGGT TGATGTAATA GAGCTATAGC 

150501 GGCTTTGATG TCTTTATTTT CAGGCTTATC CAAAGCTCCT TGGTTTTCCA 

150551 GCCATTCGAA GTAAGACTTA GGAATATCCA CAAGAGGCTG CCCTTTGTAT 

150601 TTGCCAAAAG GCATTTTGAA GACTTTCGGG TGATAGCTCT GTTGCAGCAA 

150651 GTCGAGGACT TGCTGGGGCG GTAAATCACC GATTAAAGAA GTAAATACCT 

150701 TGTGCAATAT CACTACGTCA TCTAGAGCTC GGTGTGCTTG ATTTTCAGCA 

150751 AAACCGTAAA CTTGTCTTAG GTATTGTAAA TTATGTTTTG GTAGATCGGG 

150801 GCGATATTTT TGTGCCCATT TTAGAGAGTC TATTGTACGG TTTGTCAGAG 

150851 GCTCTAAGGA ATGTCTGCGA CATTCCTTAC CGAGTAGGGG GAAATCAAAA 

150901 CCGTCATTAT TATGAGCCAC TAAGATGCTG TCCTCTCCGC AAAATTTCCT 

150951 AAATCCCTCG TAGGCTTCAG GAAATTTGGG AGCAGAAAGT ACCGCATCCG 

151001 TAGTGATTCC ATGAATTTTG GATGCCTCAT CAGGAATGGG AATTTCCGGA 

151051 TTCACATAAG TAAGAAAGGA CTCATCTGTG ACACTATTGT AGGCAGCAAT 

151101 TTCTATAATG CGATCTCTTT CTATTTGTGT TCCTGTGGTC TCCGTATCAT
151151 AGAAAATAAG AACATCCATA GTTTGACTAC TCATTACGTT TTTCTTGTTG
151201 CTCTTGAAGA GCCTGACGTC TTAGTTCATC CAAATTCATA TTCCCAGAAG
151251 AGATCAACCC AATAGCATGA GAAAAACTAT CACAGACTAG CTTTATTGTA
151301 TCGATATATA TCCGTAATAG TGTGTCATGA ATTTCTCCGT TTAGGCAGGG
151351 CAACACAAGC CGATAAAATA TCAATCCCTG TTCTTCATCC ATGCCAAAGC
151401 CGGGAATATC AATGTCCCTA TTTAAGAGAT GGAGTAAACG AGCTGTTGAT
151451 GCCTTATGAG ATTCATGCAA TTGGTAGGGA AGGTAACAAA TCAACTGCAG
151501 TATTTCTCCC TCACTGCGGA TTACAAAAAA TAAAGGGAGT TCATTGCCAT
151551 TAGCTTGAAT GTTAATGTAA GTAAGACCGC TTTCTCTTTC TAAGAAAGGT
151601 TCTTCATCCG AACTTTTAAG AAATTTTGTG AGATTATTTT GATTTAATGT
151651 CCATGTCGTC ATTTAGGAAA TACTCCAAGT TGTTCCTAGA GCCTGCATCA
151701 TTGCTGGCTG ATAATACTAG ATCTAATCTT GATTCGTCTG TTGTTTTTTT
151751 TGTATCTTTT GTATTGCTTT ACTGAGGAAA TCAGAGGTTG CAGAGGAAGA
151801 AGACTGCTTA TTATTTTCTT GAAGTAGATG ATTGATATCA GTGTCGGGAG
151851 GAAACTCTGT CAGTAGCGTA ATTTTTTCTG GTGTTTCTGC AATCACAAGA
151901 GTTTTATTCA CAACTCGAAT GAGGTAAATA GAAGTTTTCG GCGTTAGGGA
151951 ACGTCGTTCT AGGATTTTGA TTTGAGACGA GCCTCCAAAA CCGTGACTTC
152001 TTGATCTCAC AAACTTTTTA AACGCCCAAA CTCCAAAGCC AAAAATTGTT
152051 AAAAGTAGAA TCAAAGATCC TAGCATTTTA AACATTTCTA ATTTCATGCT
152101 TCCTGGGAAC ATTTCATGTA CAGAAATGGG CTCTTGGATC GTTTCTGCAA
152151 GAGCAAGCTC ATCAGAAAGC TTAAAAACTA AAGAAAAAAG ATTAAAAAAC
152201 ATGTGTAAGA CCCGCGATCA TCTCTATAAA ATTATAGTGG TAGCCCGATT
152251 TGTATCCAAC TACATACAAG TAATAATGAA GTATAGTTTT AGTCGATGCT
152301 ATATAAATTA TAGTACAATG ATTTCCAAGT ACAGATTAAA CCGTAATCAT
152351 GTATATTCCT GCAAGTACCG TCTAGAGAGC TCCCCTAGAT GATTTGGGTA
152401 ATTCACAGAC TCCTTTATAC CCTTCTAGGG TGTGCTCGTT CCACAGAGCC
152451 CAAGCTCTTG TCTTTCATAG ACAAAACGAC AGCAGTCTGT CGGTGGATGC



152501 AAAGAATTTT TTCAAGTTGC CTGAGAAATT CCTTGGCAGC GCTTAATAAA
152551 ACTATGGTGA TGCTATGGAA AAGTTACTAG TGACTGATAT TGACGGTACA
152601 ATTACCCATC AATCTCATCA TTTAGATAAA AAGGTGTATG AGCGGCTCTA
152651 TGCGCTGCAC CAAGCTGGTT GGAAGTTGTT TTTCTTGACG GGAAGGTATT
152701 ATAAATATGC TGCACGCTTG TTTTCTGATT TTGATGCTCC ATATTTATTA
152751 GGATGCCAAA ACGGCGCTTC TGTATGGTCT TCAACATCAT CAAATCTTCT
152801 CTATTCTAAA AGTTTACCCT CAGATTTATT ATGTATTTTA CAAGATTGTA
152851 TGGAGGGGGC AACGGCTCTT TTTTCCGTGG AATCAGGAGC TCCTTACGGG
152901 GATCACTACT ATCGCTTTTC ACCGACTCCT ATAGCTCAAG ATTTACACGA
152951 ATATGTAGAT CCTAGGTACT TTCCTAATGC TAAGGAAAGA GAGATCCTAT
153001 TTGAAACGCG CTCTTTAAAA GACGACTATG CTTTTCCTAG TTTTGCTGCA
153051 GCAAAAGTCT TTGGACTGCG AGATGAGGTC ATCAGAATTC AAAAGGAGCT
153101 GGAACGCCAA GAAGCACTGA CTTCAGTCGC GACGATGACG TTAATGCGCT
153151 GGCCCTTTGA CTTTCGCTAT GCCATCTTGT TTTTAACAGA TAAAAGCGTC
153201 TCTAAAGGCA AAGCCTTAGA TCGTGTTGTC AATATACTTT ATGATGGAAA
153251 GAAACCCTTT GTCATGGCTT CAGGAGATGA TGCTAATGAT CTCGATCTTA
153301 TTGAGAGAGG AGATTTTAAA ATTGTGATGA GTTCCGCACC TGAAGAGATG
153351 CACGTTCATG CGGACTTTCT AGCTCCCCCA GCAGATAAGA ATGGCATTCT
153401 TTCAGCTTGG GAAGCTGGTG TCCGCTATTA TGACGACCTT ATGAGTCTTT
153451 AGGGAACATC TCAGGACCAA TTCCCATCAC ATTGGCTCCG TGATCTACGT
153501 ATAAGGTCTC ACCAGTAATT GCTGAAGCTA GAGGTGATGC TAAGAAAGCT
153551 GCAACGGCAC CCACCTGCTC GGCATTCATA GCCTCGGGAA TAGGCGCCCA
153601 CTCTTGGTAA TAGTCTACCA TTCTTTCAAT AAAACCAATT GCTTTTCCAG


153651


1 ULAju 1 i vaL. 


1 AAAbb J.CC1 


GCAGAGATGG 


TATTGACACG 


TATGCCCCAA 


153701 CGGCGTCCCG CTTCCCAAGC AAGAGTTTTG GTGTCACTTT CCAAAGCTGC
153751 TTTTGCCGAA CTCATGCCCC CTCCGTATCC AGGAACAGCG CGCATAGAAG
153801 CCAAATAGGT GAGCGATATT GTCGATCCAC CACGGTTCAT GATACTTCCA



153851 AAGTGAGAGA GAAGGCTAAC AAAAGAATAA CTAGAGGCAC TGAGAGCCGC 

153901 TAAGTAACCT TTTCTTGATG TTTCTAATAG AGACTTAGAA ATTTCAGGAC

153951 TATTTGCCAG CGAGTGGACA AGAATGTCAA TATGACCAAA ATCTTTTTTT 

154001 ACCTGTTCTG CGACTTCTGA TATCGTGAAT CCCGTAATGC CCTTGTAACG 

154051 TTTATTTTCA GCAATATCTT CAGGAACATC TTCAGGGCTA TCAAAACTTG 

154101 CGTCCATGGG ATAGATCTTA GCAATCTCTA AGAGAGTGCC ATTCGATAAT 

154151 TTTCTAGATT CATTGAATTT TCCTAATTCC CAAGACTGAG AGAAAATTTT 

154201 GTAAATCGGT ACCCATGTTC CTACAATAAT CGTAGCTCCT GCTTCTGCAA
154251 GAAGTTTAGC AATACCCCAG CCATATCCTT GGTCATCACC AATGCCCGCA 

154301 ACAAATGCTA CCTTTCCTGT TAGATCAATC TTTAGCATGA ATCCGCCTTA
154351 TACTTTTGAA GCTTATTGGA AGGAGAGTAA CAAATCTTTC GATTATTAAG 

154401 AAAACCTTTT GGTGCCTCAA CAGGGGAGAT CCTGCCTCCA ATGTAAATAG
154451 AAACGTAAAT TCTTTAAATT TTTTTCTTTA CATATTTTAT AGAATATCCA 
154501 AACTTCTCAC TCCCGCGTAC TGCTAAAAAA ATTTTCAAAA GAATTTACGA 
154551 TCCGAACTTA TCGTAGTTTG GGTTTCACTG ATTACTTAGG AGGTTGTTTG 
154601 ACGAATCCTT TAGGGAAATT CCCCTCACCA CAGAATCCAC AGGTTGTTAC 
154651 GATAGCGCCT TCTTCCACAA CACCACAAGC AGTCTCATCT GCAGTTCAAG 
154701 GTTTTCTTCA AACTGGAGGA GCTGCCTCCT CTACAGCGAC AACTACTACC
154751 GCATCCGGAG CCTCTGCATT AGGACTTTCA CCTGATCAAG TGCAAGCGTT
154801 GCTTACTAAT TTATTAAATG TGGGACAACC ATCAGTGGGA CAACCATCAA 
154851 CTTCAGCAGG AACTTCGGGA GCCTCCTCTT CCAGTGCAAG TATGCAGCAA 
154901 CAGCTTTTGC AACTTATCTT AGACAAGACA ACAGGAAGTG GCGGATCGTC 
154951 CGTGAGTTCA GAGCAATTAC AGCAACTCCT TAGCTTGGTG AGCCAGATGA 
155001 CTACGTCTCA AGGAGGAAGT GGTGGAACTC AGGCAGGACA GGCCGCTTCG
155051 GTACTGTTGA ATTTGTTATC GGCAACAGGA TCTGCAGCAG CAAATCCTTT
155101 AGGGACAGCT GCATCGTTGG CACAGATCAT TTATGCAGCA GTAACAAGTC 
155151 CTGGAGCAAA GAAAACTAGC GAATTTTGTT ATAATTATTG TGGAGAGACC
156551 GCACAAATTT TTAGGGGAAG ATTTGTGAAT GGGGAGATAA ATTCTAGGGA

156601 GTGAGCATGG AGGAGAGGGC GGAAGATCTG GGGAGGCTGT TCTTTAGGTC 

156651 CGTAGTCGAC ATCTCCGACA ATAGGATGAC CCAGCAATCC CATTTGTAAG

156701 CGGATTTGAT GGGTTCTCCC TGTGATGGGC CTGCGGCTCC AAAAATCACA 

156751 GCTCCACACC TCCGGTATAC GGGGGCCGTA TAAGATTTTA CGGTTCCAAA 

156801 TTTTTTTTTA GGATGACCAA AAACGAAAGC TATGTATTGT TTATGGATTT 

156851 TTCTTTGCTT GAACAATTTC ATGAGCTCAG TAGCCGCTTG TTTAGACTTT 

156901 CCCATGAGAA GACACCCAGA GGTGCCTTTG TCTAACCTAT GCACAGTAAA 

156951 AAACCGTGTC ATGTGTGCCA TTTGTTCAGT AGTAAGATGG GGAGGTTTTT 

157001 CGTAGATAAT GCTATAGTCA TCCTCCCAGA GGATGCTAGG TTGTTGTTTT
157051 GTTGAGGGGA TCAGAGATAG GGAAACACGG TCGCCAGGTT GTACCTTGTA

157101 GGATTCAAAT CTTTCTATGA ACCCGTTCAC TCGACATCGA TGTTGGCGAA 

157151 TAGACGCCAA GATTTCTTGC TTGCTATGAT TAGGCAGTTG AGATCTAAGA 

157201 AAAGAAGATA ATCTTGAGAC TTGTGTGGCA AGCCAGGAAA AATTTTCCAT 

157251 AAAATATTGT AAAGCAGCCC TTTTATCATT TGATAATTGC ATAAAATTTT 

157301 AAGAGATTTT GTATGACAAA GATAGCTTTT TCTGAAAAGG CAAAGAATTT
157351 TCCTGTAGAG GCATTAAAAA AATGGTTTGA AAAAAATAAA CGATCTCTTC
157401 CTTGGAGAGA TAACCCGACT CCCTATAGTG TGTGGGTTTC CGAAGTTATG
157451 CTACAGCAAA CGCGAGCTGA AGTTGTTATA GATTATTTTA ATCAGTGGAT 
157501 GGAGAGATTT CCTACCATAG AGTCTTTAGC TGCAGCAAAA GAAGAAGATG 
157551 TCATTAAGTT ATGGGAGGGA TTGGGTTATT ATTCTCGAGC GCGCCATCTT 
157601 TTAGAGGGAG CTCGCATGGT TATGGAGGAG TTTCATGGAA AGATCCCTGA
157651 TGATGCCATT TCCTTAGCTC AAATTCGTGG AGTTGGTCCT TATACGGTTC
157701 ATGCTATTCT AGCCTTTGCT TTTAAGAGGC GTGCTGCTGC TGTGGATGGC
157751 AATGTCTTGC GTGTTCTTAG CCGGATATTT TTGATAGAAA CTTCTATAGA 
157801 CTTAGAATCA ACTCGTACTT GGGTTTCTAG GATTGCTCAA GCGCTTCTTC 
157851 CTCATAAGAG TCCCGAGGTT ATAGCTGAGG CTCTGATAGA GTTGGGAGCT
157901 TGTATCTGTA AAAAAGTTCC TCAATGTCAT CGTTGTCCTG TCCGTCAAGC

157951 ATGTGGAGCT TGGAGGGAGA ACAAACAGTT CGTATTGCCG GTACGTCATG 

158001 CCAGAAAAAA GGTCATCTTT TTGCATCGTT TGGTAGCGAT TGTATTGTAC 

158051 GATGGCTCTT TGGTTGTCGA GAAGAGACGT CCTAAAGAAA TGATGGCAGG 

158101 CTTATATGAA TTTCCTTATA TTGAAGTTGA ACCAGAGGAA GGTCTTCAAG 

158151 ATATAGAAGG ATTTACTAAG AAGATGGAGC TTTCTTTAGA AAGCCCTTTG 

158201 GAATTCTTAG GTAACCTTAA AGAACAGCGG CATGCGTTTA CTAATCATAA 

158251 GGTTCATTTG TGTCCTATAA TTTTTAAAGC CACTTCTCTG CCTCAGTTCG 

158301 GGGAATTGCA TCTTTTGAGT GATATAGATC ACTTAGCTTT TTCTTCAGGA

158351 CACAAAAAGA TTAAAGATGC TTTGCTAATC TACCTCGGGG ATGTCAGGTC 

158401 TAGAGAATCA ATAGGAGTAT AGATGCGAGA TCACGCTTTT TCTAAATTGA 

158451 TAGGGACTGT CCGTGCCATG GTAGTTGAAG GACGTTGTCC TTGGTCACTT 

158501 CAGCAATCCC TAGTCTCTAT GGTAGAGCAT ATTCTTGGAG AGTGTCAGGA 

158551 ATTTCACGAG GCCGTCTTAC AAGGTAAGAC GGTACAAGAG GTTGGTTCCG 

158601 AAGCCGGGGA TGTCTTAACT TTAGTTCTAA TTTTATGTTT TCTGTTAGAA 

158651 CGAGAGGGCG TACTTGCTTC CGAAGACGTT GCCAATGAGG CTATGGAAAA 

158701 ATTGCGTCGC CGTGCTCCTT ATATATTCGC TGAAGATTAC AAGCCGGTCT 

158751 CGATTGAAGA GGCCGATCGC CTTTGGGAGC TTGCTAAGCA CCGAGAGAAA 

158801 AATGAATCTA CATAGTTGAA GTTTTGGTCT ATTTTTAAGC ATATGGTGCT 

158851 TTTGAAAAAA CAGAATATAT GCTATCAAAG AAGGGTAAGT TGGGGGCCTT 

158901 TTAAGAGAAG GAACCTGCGA ATCGGGTCAG GACTGGAAGG TAGCAGCCCT 

158951 AAGGAGAGTT TTCTTTTGCT AAAAGAATGT TCTCCAACTT ACTCTTTTTA 

159001 CTTTATTCCC AAAAATAGCA ATGAGGTGAG GTTAAACAAC CCGTGCAGTG 

159051 CAATGGGAGA AAGAATGTGC CGATCTTTTT CATATAGAAA CCCTGCAGAT

159101 AAGGAAAAAA CAAAGAGCAC GGGGACAAAG ACGCAACTTC CTAAAGAGTG 

159151 TTCAATGTGA ATGAAAGAGA AAATAATAGA AGAGCATAGT ACCGCAGCTA 

159201 TGCGCGTCAT TTTGTTTTTC AAGAATGTCT GTAGAATTCC TCTAAAAAAT
159251 ACCTCTTCTC CAAATGGAGT GAGGACGCCT AAATTTAGAA TCATGCTAAT

159301 GTAGTGTCCT GTTATAGGCA GAGAGTTCTG AACTTCTTGA GTGACTTCTT 

159351 GTGTGTGAAT CTCTTGCGTA GGAAGAACCA AAGTTAAAAA TTTACTCATC 

159401 ATAATCCCAA TCAGTTGTGT TACTGGGATG ATGATGATCC ACATTCTGAT

159451 GGCAGATCCT AGAGCACGCC ATGAAGTTTT AACCGGTCTT TCTCCAGAGA 

159501 AAAGTATAGC ACGTGTGATA TCCTTGGGGA GAAAAAGCAG GTAGAACAGA 

159551 AATGCAAAGG CAAGGCTAAT TCCTGTCATG GTGGAAAGTA ATTCCGCAGT 

159601 TTGTGAGCTC ACGCTAAGAG CTACAAGGGA AGAAAAAACA AGAAGAGCAC 

159651 CACCAAATAA AACTTGGCGG AGCTTTAAAG GTGTTTTCCC AGAGGGTGCT 

159701 GGCCAGATAA AGAAGTTTTT GGAAGCTAGA GCAGCGACGC CAAGGGACAA

159751 GAGAAGAATA AACTTGGACA TTTCCTTAGA CTACGAGTAG TTAGCACAAA 

159801 CATAGCCCTC AACTCTGGCA ACAACTTCGC GGAAAAGACG GCTATGCATT

159851 AAGCCCATGG GCGTTGAATC AAAATGTTTT GAGTTCCAGC CATAGCGATG

159901 ATAATCATTG ACTAGGGTAG TTAAAGGCTG GCTGCATTCG ATAATCTCTT 

159951 GATAAATGAG AGCTATTTTA TGATGACGGA TATCAAAAAC GCGAACACGT 

160001 ACAGACGCTG TTACAGAATC GACACCTGCT TCTTTCCCTG TCTTTTGTTC 

160051 TAACAGTTCT GTAGCAACAA TGAATTCTGC AGGAAGAAAT TGCTCAATAA 

160101 TTGTTTCGGG TAGACGATTC GCAATCGGAG CATAGAACTG AGAGACTGTC 

160151 TGAGGTGAAG CATTGTGCTT GATCAGGAAG ACCTTTTCCG AAGCATAAAA 

160201 CCTTTTGCTG ATCTCTTCAG TAAATTCTCC TTGGAGGTTC CAAGGTAAAG 

160251 GTTCAAGACT CTTTCCTGGG CGATGAAATA CAGGAAGCAT CGCAATCACA 

160301 CCTTTAGTTT TGCTCCCTGA AGTGTATAGC TTAGGATGAT AACTTCCTGA 

160351 AGAGCCTAAG TGAGTGCAGC TGGATAGGGT TGGGGATAGA AGTCCTAAAG 

160401 ATGCCAATAA TACCAACATT TTTCGCATAG TCACTGTCCT TAAATTGCTT 

160451 ATTTTGCAAA AGATTCTAGC CCTGGGAAAG TTTTTACTTT TAAGATCAAT 

160501 ACTTTCGCAA TTGAGAGATT TTCCATTTAA AACTCTCATT AGCTTATATC 

160551 AAAGAAAAAA ATAAAAACAA GCAAAGAGAC CGTCTCAGTT TTAGTTTAGA
160601 AACTCAAGGT TGAGAAAGGG ATTCTGACCA AAGTTGTGAG GGAACTTTGG

160651 TAACTTTTTC TTTAGGAATC AATGTGCACC CTGGGAGGAA TACAACTGGA 

160701 GCTGAGATCA CAGAAAAAAT AAACCACTTC ATGACTACCT CTGCAAAAAG

160751 AATACTACTA TTTTCTATTT GCATAGGCAG TTCTTCGATT TATAGCAATT

160801 TTTACTTTAT CATCATAAAA ACTATGATGA AAAGGTTCTT AGTGAACTTC 

160851 TAAGGAACAC CATGAGTTTG TGATTAAAAC TCATGGTGCG TAGTTGAACC 

160901 CTTATTGAGG AGGGACGCAA CCACTAAGAG CTAACAATAC TAAACAAGAT 

160951 AATAGTAAAG AGAATGCTTT TTTCATTATT CATCCTTAGG TTAATTGACC 

161001 TTTCCAGCCT AGCTCTAGGG CGATTATTTA TCAAATTTTT CTTTGTAATT 

161051 AATGATCATG CGACCATTAA TTTAGCGATA AATTATGATT TCGTCAGGAA 

161101 AATTCAATTC TTTATAATAA TGATATGAAA TTAGAGAATG TCTATAGGGG 

161151 CGGACTCTAT TTGTGATCCA GGATCTCTTT AGGAGCACTT TGTGGATTTT 

161201 GATTATTTTG GTCTGAGTGA TATTGGTAGG GTGCGCGCTA GAAATGAAGA 

161251 TTTTTGGCAG GTAAACCTCA TGTCTCAAGT GGTTGCTATT GCTGACGGTG 

161301 TTGGGGGGCG TCTTGGTGGA GACATTGCTT CTCAAGAGGC AGTGACTAGC

161351 CTTATGGAGC TGATTGATGA GCAACAGTCA AAATTGATGG GGTATGGGGA

161401 TGACCAGTAT AAGGAGACTT TAAAAAAGAT CCTTTTAGAG GTCAATGGTG 

161451 TGGTCTATGA ACACGGCCAA ATGGAAGAGC ATCTCCAGGG TATGGGAACC 

161501 ACTCTTAGCT TCATCCAATT CCGGAAGGAT AGGGCATGGC TATTTCATGT 

161551 GGGAGATAGT CGAATTTATC GTATTCGTGA GGGAGAACTG CGCCGCCTTA 

161601 CCGAAGACCA TTCTTTAGAA AATCAATTAA AAAATCGTTA TGGGCTTCCT 

161651 AAACAATCAG ATAAGGTGTA TTCTTATCGC CATATTCTGA CTAATGTTTT 

161701 GGGAAGTCGT CCCTATGTCA TGCCTGACAT TCGGAATCTT CCTTGTGAAA

161751 AGGAAGATTT GTACTGCCTC TGTTCGGATG GATTGACAAA CATGGTTCCA

161801 GATATCGATA TTCGTGATAT CTTGAACCAG CCCGCCACGC TAGAAGAACG 

161851 GGGGAATGCA TTAATTTCTC TAGCCAATAC TCGTGGAGGC GATGACAACG 

161901 CTACTGTCGT ATTAGTCCGA ATACAATAGT TCCTTTGCTA AGGATAGTAT
161951 TCCATGATCT ATTTGGATAA CAATGCGATG ACACCCCCAG AGAGGGGACT

162001 TTTGGAATTT CTCCAAAAAA CCTTCCTTAT AGAAGGGACG TACGCGAATC 

162051 CTTCGAGCGT CCATCAATTA GGTAAAAAAT CTCGTCAACT GGTTCTAGAA 

162101 GCTTCACACT GGATGCAAAA GGTCCTTTCG TTTCAGGGCC GTGTCCTCTA 

162151 TACCTCAGGG GCTACTGAGA GTTTAAATTT AGCAATAGCA AGCCTCCCTA 

162201 AAGACAGTCA TGTTATCACC TCAGGTAGCG AACACCCCGC CATCTTAGAG 

162251 CCTTTAAAAC ATTCCTCGCT TTCCGTTTCT TATTTAAATC CCGAAGAAGG 

162301 GAGATGTGTT CTTACTATAG AGCAGATTGA AAGAGCTGTG ACTCCTAAAA

162351 CTTCAGCAAT CATCTTAGGT TGGGTCAATA GTGAGACTGG TGCCAAAGCT

162401 GATATAGCTG CTATAGCCCA CTTCGCGCAA GAACGACAAT TGCAATTTAT 

162451 TGTGGATGCG ACTGCAAATG TAGGTAAGGA GAGGATAGTT CTTCCCTCTG 

162501 GTGTCACTAT GGCAGCATTC AGTGGACATA AATTTCATGC ACTCTCTGGA

162551 ATCGGAGCTC TTCTGGTCTC TCCAGGAGTC AAACTACATC CTCAGCTGTG

162601 GGGAGGAGGT CAGCAAGGAG GGCTGCGCGC AGGCACAGAA AATCTTTGGG 

162651 GAATCGCCTC TCTGCTTTAT ATTTTCAAAT ACCTAGATCT TCATCAAGAG
162701 CGTATCTCTC AGGAAATTCT TACCCATAGA AATGGTTTTG AAAAGGCAAT
162751 CAAAGCACGC ATTCCTGATG TCCATATTCA TTGTGCGGAT CAACCACGGG
162801 CAAACAACGT CTCAGCAATT GCTTTCCCTC CGTTGGAAGG TGAGGTATTG
162851 CAAATCGCCT TAGATATAGA AGGAGTGGCT TGTGGTTATG GATCCGCATG
162901 CTCTTCAGGT GCTACCGCAC CCTTTAAATC TCTTGTCAGC ATGGGTGTTG 
162951 ATGAAGAGTT GACCCTGGCA ACACTCAGGT TTTCTTTTAG CCATCTTCTC 

163001 TTGCAAGAAG ATGTTGAAAG AGCCGTTGGA ATTATAGAAA AAGTCGTAGA
163051 ACGTTTGAAA AATTCCTAAG TCTTAAAAGA GAACATGTTT CTAAGCTGAA
163101 AGAACAGTCC TGACTCTTAT TGCAGAATCT ATGAGAGTAA GTTTTTAATC 
163151 GATACGGTTT TTATCCCAGA TAAAGACATC TCTTTAACTT CTAAAAGCAA 
163201 GTTGTTGATA ATTACAGAAG TTCCTACTTC TGCAGGACTG TCTAGCAGTT 
163251 GCAATACCAA TTGGGCTAGG GTTTCTACAG GATATTGCGG AAATTGAATA
163301 TCGAGTTCTT TTTGCAGATC TTTTATGCGA GAGTTGCCAG GAAACGTTCT

163351 TTCAATAACA GAGATGGTCT TGGGTTTTAA ATGAGCAATG TTTGTAGTGT

163401 TGAATAAGAT TTTGAAAATT GCATTTAAAC TAAGAATACC TATAGGTTCA
163451 CCAGAAGCAT TGAGGACAAC AGCAACACTC GAACGGTTGT CTCGAAACTC
163501 TTTGAGGATA CGAATAAGTT TTGATTTTGC AGTGATAAAC CAAGGCGAGT
163551 GTAGATTATT GATTAGGGGT TCATCAAGAG CTTTATTGAC AAAGTCTTTA 
163601 GGATGGGCAA TCCCAATAAC GTTTTTTCGG GCCTTGTGAT AGACAGGAAT 
163651 AAAGTTGATA TCTGTATTTT TTATAGTCCG GCAAAAATCT TTAACATTTG
163701 CAGAAGAAGG AAGCATGGTA ACCTGTTCTA AAGGTTGGCA TACCTGATCT
163751 GCACAAGTCG CACTTAAAGA GAAAATATTT GTAGCAATTG TATTGAAATC
163801 TTGTTCTTCA TGGTGAGTCT CTAAAGCTTT TTGGAACTCG TCTCTACTTA

163851 ATGTAGAGTT CAATTTTTCT TTCCTAATAT TTAGAAGATA GTAAAGACCC
163901 TCAGTGAGAC TTCCTATGAG CTGAATCAGA GGATAGAAAA TATAGTGGGA 
163951 ATAATAGAGA ATCGGTGCTC CCCAAAGTGC TAATTTTTCA GGAATCTTCC 
164001 GTGATATTGT TAGAGGTAGA AGTTCTGCAA AAATCACAAC TATAAAAATT 
164051 TGAGTGAAAG GAGCGTAATC TGGAGTGATT CCTAAAGCTC GATAGCAATT 
164101 TCTTGAGGAC TCAGACCCGA CTTGTAGAGC GATATTCACT CCTAACATCA 
164151 CCGTTCCAAA TAAACGATAG GGGCGGCGAA TCAGGAAATT AATGTAGCGA 
164201 GCTTTCTTAT GATCTTTAGT CAGATAGTAT TGCAATCGTA CACGGTTAAA 
164251 TGACACGCAG GCCATTTCCA TCATCGAATA GAATCCTTGT AAGACAATAC 
164301 AGATAATGTT GACTCCTATC CAAAAGAGAG CAGAATTAGT CATACAATTT 
164351 CCTTATATAC ACACGGCGAA TGCGATTCGG AGCAGCGTCT AATACCTGGA 
164401 AAAGCAAGTT ATTCCAAGAG AGTTTCATTC CTGTTGTCGG AATCGTTCCG 
164451 ATTTGCTCTA TTAACCAGCC TCCTATAGTC GCAATATTAT TGTTCGTCGG 

164501 TAGGTTGATA TCGAAGATCT CACTAAACTC ACGGAGTTCT AAAGTTCCTG
164551 AGGCAATAAT AACATCAGCT CCTGAGGTGG TATAGAGTAT TTTATTATCT
164601 CTCTGGTCTA CAATTTCTCC AGCAACAATT TCAAAGAGGT CTTCTTGAGT
164651 GATCAATCCT TCAATAGATC CGTATTCATC AATGATCATC CCTAGGGTTT

164701 CGTCTTCAGC TGCCATCTGA CATAAAGCCA TTTTTGCAGA GATGGTTTCT

164751 GGCATATAAT ACGGTTTTTT CAGCAAGGGG AGGAGATCAT CCGAAGATTG

164801 CAGTGGCTTG TCATGTAAAA GAAGAGAGCG CGCTGTGCAA ATGCCCAGAA 

164851 GGTTTTGGAG GTTATCGTTA CATATAGGAA CTCGTGAGCA ATGCTGTTTA 

164901 GAAAATAAAA GATAGAGGTT CTCTAAAGGG GTTTGGATAT CATAAAATAA 

164951 AATATCCTGG CGTGGCTGCA TACGCTCTTT AACACTACAA TCACTAAGAG 

165001 AAAGATAACC ATAGAGTAAA CGGCTTTCTT CTTGATTGAC TACGCCGAAA 

165051 TCCTTACAAC TTTGCAATAC TTCCTTCAGC TCTTGGGGTT GGATGATATC 

165101 AATCTGTTGC TTCGATAAAA TCCATTGGAC CACATAATTA ATTCCTACGA 

165151 TACCCCAGTG GAGTAGGGGT TTGAAGATTT TAGTAACACA AAGAATAAGA 

165201 GGGGCTACGG AACTAGCAAT CTGTGTATTA AAAGGAAGAG CTACTGCTTT 

165251 AGGGAGAATC TCACCTAAGA TCAAAGTAAT TGCTAAAGGA AGACCTACAG 

165301 TAAACCACCA CGAAGCTGCA TCTCCAAATA GAATGGCAAA ACAGTTTTGA
165351 ATAGCAATAT TCAGTCCGAT ATCACAAAAA ATTAAGGTGA TGAGCAGGTG

165401 GTGGGGATGT AGAAGAAGGG TAGCTACTCG CTGCTGTTTC TTAGATTTAG
165451 AGCGCTTATA GTGCGAGATC AAACTCGTAG GCAAAGAAAA CAAAGCAATT 
165501 TGAGATAACG AAATGAATCC CGAGCATAAA GTAAAACAGA TAATGAAGAA 
165551 CATTAACATG GTAGGAATCA TGGTCTCTTT TCAGTCCTTA TTTTCTGATT 
165601 GTTGCTTTGG GGAGACACAG AGTTTCTTAT AGCCTTTAGG AATGAGACCC 
165651 AGACGCTGGA TTAAAGCAGC TTCTATAGCA GCGGAGTCTT GCCAGTGTTG 
165701 CAGATGTAAT TGGAGTTGAC GCTGCTTTTC TTGAGCAGAA AGAATGTCTT 
165751 GGCATAAAGA AGAGACCTTG CTTTGTAAGC GTAGCTCTTC TGTACGTAAC 
165801 TCCTGGATAG CACGATCATA AACAAAGCCT CCAATTAAGA TGCTAAAGAT 
165851 CACCCACCAG GATTTGATCA TCACTTCTTC TAGTAATCTA AAACCCCAGT 
165901 TTTTTTTTCT TACTGAAACT TTAGACACAA GGTACGGTGA TGCCTTGTTG 
165951 CTTTTGATAC TTTCCTTTTC TGTCTGCATA AGAAACCTCG CAGGTCGTAC
166001 TAGACTCAAA GAATAAGACC TGGGCAATCC CTTCATTAGC GTAAATTTTC

166051 GCTGGCAATG GCGTAGTGTT AGAAATTTCT ATAGTCACAT GCCCTTCCCA

166101 TTCAGGCTCA AAAGGTGTGA CATTTACGAT AATTCCACAG CGTGCATATG 

166151 TAGACTTTCC TATACACATT GTTAAGACAT TTCTAGGAAT TCGGAAATAC 

166201 TCAACGCTAC GAGCTAGAGC AAAAGAATTT GGAGGAACAA TACAGACGTC 

166251 ATCAGTAATA GAGATGAAGA TATCCTCAGT AAAGCATTTT GGATCAACAA 

166301 CAGAGTTATA GACATTGGTG AACACTTTGA ATTCTCGAGA TAGGCGGAGG

166351 TCGTAACCAT AACTCGATAG GCCGTAACTT ATAAGTTTTT CGCCTGTCTC 

166401 CTCATTTACG TTCACTTGGC CATTAACAAA GGGATGGATC ATATCGGCAT 

166451 TTAGGGCCAT CTCTCGTATC CACTTATCTT CTTTTATGCT CATTTAGAAA 

166501 CCTTAACAGT TTGAAATTGC TTTCTTAATG ATATTCTGTT TTTCAATTTA 

166551 CTGGTTTTTG GGGGGAACTT TTCTAAGTAT AAGATAGACT TTGATTATCT 

166601 CTTGAAAAGA CCAGTTGTAT AAACAAGAAA AGCCTATCCC AAAGGCTACA

166651 ATTTTATCAC GAAACCTAGA GGTAATGTTA GATAATCCTA AGGGAAAAAG 

166701 GCAAACCTTA TTTTTAGGGA GAACTTCAGG TAGGTCTGCT CTTTACTCTT

166751 ATAGTAGAAG AATCTTGGTT CTCTTGAATG CATTCATGCG AGGACCTTGA 

166801 TAAGAACTTC TTGGATTCAT AAAAAGATTA ACATCTCCTT ATTGATAAGC 

166851 TAGAGAATTT TTACTACCAA CTTCTCAGTG GAAAATGTTT TTAAAAATAG 

166901 TTCGCCATCT TTAATTTATC TGTTTTAAGA CAAAAGAAAT CTAGATCACC 

166951 ACAGGAAGTT TAAATCATAA AATGAAAATG ATGGAGAGGT TCTAGTGCTC 

167001 GTACTTTGGC CCTGCTCTCC TTGATAGAAA GAAGAGGTCC ATAGTGTACT 

167051 TCTATATAGT ATCTCGTGTA CTATGCCGAG TATAACCGAT CGGCGTTATC 

167101 GATGAGAGTT TCAAAAAAAT ATAAAATCCA CCTAAAGAAA AAGCGATAGA 

167151 GAAGGTTCGT ACATGACGCA TCAAGTAGCT GTCTTGCATC AGGATAAAAA 

167201 ATTTGATGTT TCGTTAAGAC CTAAAGGGTT AGAAGAATTT TATGGACAGC

167251 ATCATTTAAA AGAACGCCTA GATCTATTTC TTTGCGCAGC ATTGCAACGA 

167301 GGAGAAGTTC CAGGACATTG CTTGTTTTTT GGACCCCCAG GCTTAGGGAA
167351 AACCTCACTT GCTCACATCG TTGCCTACAC CGTGGGGAAA GGGCTGGTCT

167401 TGGCATCAGG GCCTCAGTTA ATCAAACCCT CGGACCTGTT AGGACTTTTA

167451 ACTAGTTTGC AAGAAGGGGA CGTGTTTTTC ATCGATGAGA TCCATCGTAT

167501 GGGGAAAGTT GCTGAGGAAT ACCTGTATTC TGCAATGGAA GATTTCAAAG

167551 TCGATATTAC TATAGATTCA GGACCCGGAG CTCGCTCGGT CCGTGTCGAT

167601 CTTGCTCCTT TCACTTTAGT GGGGGCAACG ACTCGATCAG GAATGCTAAG 

167651 CGAACCTTTA AGAGCACGCT TTGCTTTTAG TGCGAGACTT TCCTATTACT 

167701 CGGATCAAGA TCTAAAAGAG ATTTTAGTCC GCTCCTCACA TTTACTCGGA

167751 ATCGAAGCTG ACAGCTCCGC ATTACTAGAA ATTGCTAAGA GATCCCGAGG 

167801 GACGCCACGA CTGGCAAATC ATCTTCTACG TTGGGTCAGA GATTTTGCTC

167851 AGATCCGAGA AGGAAACTGT ATCAATGGGG ACGTAGCAGA AAAAGCTTTG

167901 GCTATGCTAT TAATAGATGA TTGGGGATTG AATGAAATTG ATATCAAACT

167951 TCTCACTACA ATCATCGACT ACTACCAAGG TGGTCCCGTT GGAATTAAAA 

168001 CCTTATCGGT AGCTGTGGGA GAAGATATCA AAACTCTTGA AGATGTTTAT 

168051 GAACCGTTTT TAATTTTAAA AGGTTTTATC AAAAAGACTC CCAGAGGCAG 

168101 AATGGTAACA CAACTTGCTT ACGACCATTT AAAAAGACAT GCAAAGAACT 

168151 TATTGAGTTT AGGAGAAGGA CAGTGAAACT ATTGAAAAAC GTACTTTTAG 

168201 GTCTTTTCTT CAGTATGAGT ATCTCAGGAT TCTCAGAAGT AAAGGTATCC 

168251 GATACTTTTG TGAAGCAGGA TACTGTCGTT GAACCTAAAA TTCGTGTCCT 

168301 TTTATCTAAT GAAAGCACCA CAGCTCTCAT AGAAGCCAAA GGTCCTTATC 

168351 GCATTTATGG AGATAATGTC TTATTAGACA CAGCGATTCA AGGCCAGCGT 

168401 TGCGTGGTCC ACGCTCTATA CGAAGGGATC CGTTGGGGAG AATTTTATCC 

168451 CGGACTCCAG TGTTTAAAGA TCGAGCCTGT AGATGACACT GCTTCTCTTT 

168501 TTTTTAACGG GATTCAGTAT CAAGGTTCCC TATACGTTCA TCGTAAAGAC 

168551 AACCATTGCA TCATGGTTTC TAACGAAGTT ACAATCGAAG ATTATCTGAA 

168601 ATCTGTACTT TCTATAAAGT ACCTTGAAGA GCTAGATAAA GAAGCTCTAT

168651 CTGCTTGCAT CATTCTAGAA AGAACCGCTC TATACGAAAA GCTCCTTGCA
168701 AGAAATCCTC AAAACTTTTG GCATGTTAAA GCTGAAGAAG AAGGGTATGC

168751 AGGATTTGGT GTGACCAAGC AGTTCTATGG TGTAGAAGAG GCTATAGACT 

168801 GGACAGCTCG TTTAGTTGTG GATAGCCCTC AAGGATTAAT TATAGATGCA 

168851 CAAGGGCTCT TGCAGTCCAA CGTAGATCGT CTTGCTATAG AAGGATTCAA 

168901 TGCACGTCAG ATTCTTGAGA AGTTCTACAA GGATGTGGAT TTTGTAGTTA

168951 TAGAATCCTG GAATGAAGAA CTGGACGGAG AGATCAGGTA ACCTCTTTCG 

169001 CATGGCTGAT CGCAATTAGC GTGGTATGGG GCTGTAGCGA CACTGTCGGC 

169051 GTTGCTACAT TTTGAGGGAC AAACCCTTGC TGACTCTCGG CAACTATTTG 

169101 ATAAGGAAGA AAGTTGCTGG AGGCTTTAGG TAAGGTCGCA AGTTGGTCTT 

169151 GAGCTCCCAC GTGAAAAGCA ACATATACAT GCGCTTTTGG CGATTTTATT 

169201 TTAAATGCTA AGAAATTTCC AGGGCGCCAT GTCATGGGAT TTCCCATAGC
169251 ATCTACCCAA CTGATTTCCT TATTGGAAAG AAAGCCTCGA TTAAAAAGTG 

169301 TTTTATATTT TTTTCGAAAC GCAATGAGAT CACAGAGAAA GTGCATCAGT
169351 GTAGGCTTTG CGGTAAGCTG ATCCCAAAGG AAGTAATTCG CATTCGAATC 
169401 CAAAGCCCAA CGGTTGTTAT TGCCTTCCGC GGTATGGGCA TACTCATCTC 
169451 CTGATTGAAT CATCGGAATG CCTTGCGAGA CCATCAAAGT AAGGAAAAAA 
169501 TTTCGTAACT GTCTTTCACG AACTTCAAGA ATGCCAGGGT CTTCTGTTTT 
169551 CCCTTCCGTT CCGAAATTGT AGCTGTAGTT CGCATCTGTG CCGTCACGAT 
169601 TATCCTCTCC GTTAGCCTCA TTATGTTTGT GGTTATAAGT CACAGTGTCA 
169651 CATAACGTAA AACCATCATG GCAACTGACA TAGTTAATCG AATTTGTAGG 
169701 CGAGCCGTGA GGATAGATGT CTTGAGATCC TGAAATTCTA GAAGCAAAGG 
169751 TTCCTATGAG ATTTTGATCC CCATTAAGAA ATGCTTTCAC GTTATCACGA
169801 TACGGGCCGT TCCATTCACT CCATCTTGGA GACAGTGTGG GGAAATAGCC 
169851 CACCTGATAC AAACCGCCAG CATCCCAAGG CTCAGCTATA ATCTTTGTGC 
169901 TCGCAAGTAA AGGATCAAAA GAAATCGCCT CTAAAACAGG AGCGAATTGT
169951 AGGGGAGATC CCGAAGGACC ACGAGAAAAG ACAGAAGCAA GATCAAATCG 
170001 GAACCCATCG ACATGCATTT CTTCTACCCA ATAACGTAAG ATGTCGAGAA
170051 TCCATTGGGT CGTGGGGGCG CGGTTTGTAT TGAGAGTGTT TCCACAGCCT
170101 GAATAATTTG TAAAGTGACC TTGTGCATCT AAAATATAAT AGCTCGGAGT
170151 GTCTATCCAA GGCAAAGAGC AGGTCGTCCC TTGCAAGCCC GTATGATTAA
170201 AAACAACATC AAGAATGACC TCAATACCTT CTTGATGCAA GGTCTTTACT
170251 AAAGTTTTAA ACTCTCTACT TGGAGCGCAA GGATCAGAGG CATAAGCATA
170301 ACGTCGGCAA GGAGAAAAGA AATTTAGGGG AGCATAACCC CAATAATTGC
170351 ACAGATAAGG GAATTTCGAA TTTCTAAAAG GATGCGCAGT CTCATCGAAC
170401 TCAAAGATAG GTAAGAGTTC AACAGCGTTG ATTCCCAGCT TATGCAGATG
170451 GTCGATCTTT TCAATGATTC CTAGGAAGGT TCCCGGAGCA TGAACCCTAG
170501 ATGAAGAAGA TTGCGTGAAG GAACGTACAT GCATCTCATA GATGATCATC
170551 TCTTCTTTCG GCAAATGCAG AGGCTGATCA CCATCCCAAG GAAATGGTTC
170601 TTCCTTTAAA TAACAAAATG CATAATCCCC CTGTTTCTTT CGCGAACCAA
170651 AACTCTGTGG GGAATGAATA TTCTTCGCAT AGGGATCTGC AAGATATTCT
170701 TTAAAAGAGT ATTGCATTCC ATGCTTTTTA GGCCCATGAA CACGAAATGC
170751 ATAAGACGAT TGATCAGAAA TACCCTCGAT CTCTATATGC CAAATCGCAC
170801 CCGTGCGGTG TGTATCGGGG TAAAGAGGGA CTTCTATGAC TTCTGAATTT
170851 TCGTCTGTTA AAGCAAGGAT GACTTCGGTA GCTTGTGAAG CATATAAAGC
170901 AAATCGATAG CGGTTTGGGG AAATTTTAGA AGCCCCAAGA GGTAAAGGAA
170951 CTGAGGGATA AGAAGAAACT TTTTCCATCG TCGATCAGAT ATCATAACGG
171001 CGCCAGAAAT ACAAGGGAGT TTTTGATTTT AAGCCTCTGA AATCGTTTAC
171051 TAACAAATCA CTCTTATGGT AACTTGAAAC AACAACAATC TATTACAAGG
171101 AGATTCCCTC ATGTCCAGGC AAAATGCTGA GGAAAATCTA AAAAATTTTG
171151 CTAAAGAATT AAAACTGCCG GATGTAGCTT TTGATCAGAA TAACACGTGC
171201 [illegible] [illegible] [illegible] CACCTTACTT ATGAAGAACA
171251 TTCTGATCGT CXTTATGTCT ACGCTCCTCT GTTAGATGGA CTCCCTGATA
171301 ATACTCAGAG GAAATTGGCC TTATATGAGA AATTATTAGA AGGATCTATG
171351 CTTGGCGGTC AGATGGCTGG AGGTGGAGTA GGTGTTGCTA CTAAGGAACA
171401 GCTCATTCTC ATGCACTGCG TTTTGGATAT GAAATATGCT GAAACGAATT
171451 TACTAAAAGC ATTTGCTCAA TTGTTTATTG AAACTGTTGT TAAGTGGCGT
171501 ACCGTGTGTG CTGATATTTG TGCTGGTAGA GAACCTTCTG TAGATACAAT
171551 GCCACAAATG CCTCAAGGCG GTGGTGGAAT GCAACCTCCT CCTACAGGTA
171601 TTCGTGCTTA ATTAAGCATA TAAGTACTAC ATCTTTTTTC TAAATTTAAA
171651 AGATTGATAA AGCTCTTCTT AGAGAAGAAA TGACTTTATT GACTTCTCGT
171701 TTTTGTCCTG AATCTGTTTT GAGAAGTTCT TAAGATGTAT TAATTAAAGA
171751 AGTATTAAAA ATAGAAATCT AAAGGCTATC TTATGATGTT TGGGCATTTT
171801 GCTGGTTACC TTGGAGCAGA TCCTGAAGAG CGAATGACTT CCAAAGGAAA
171851 ACGTGTGATC ACTCTGAGAC TGGGAGTGAA GACTCGAGTT GGAATGAAAG
171901 ATGAAACTGT TTGGTGCAAA TGCAATATTT GGCACAATCG CTATGATAAG
171951 ATGCTTCCTT ACTTGAAGAA AGGCTCAGGA GTCATTGTTG CTGGCGATAT
172001 CTCTGTAGAG AGTTACATGA GCAAAGATGG TTCACCGCAA TCTTCTTTAG
172051 TGATTAGTGT AGATTCTTTG AAATTCAGTC CTTTCGGTCG CAATGAAGGC
172101 AGCCGTTCTC CATCTTTAGA AGACAATCAT CAGCAAGTGG GATATGAATC
172151 TGTATCCGTA GGGTTTGAAG GTGAAGCACT GGACGCAGAA GCTATTAAAG
172201 ATAAAGATAT GTATGCTGGT TATGGTCAAG AACAGCAGTA TGTCTGTGAA
172251 GATGTTCCTT TTTAATTCCT AGTCATTAAA GGAGAGTTTG TGGTTTTATT
172301 TCATGCTCAA GCCTCTGGGC GTAATCGTGT TAAGGCAGAT GCTATAGTCC
172351 TGCCCTTTTG GCATTTTAAG GATGCAAAAA ATGCAGCTTC TTTTGAAGCC
172401 GAGTTTGAAC CCTCGTATCT CCCCGCTTTA GAAAACTTTC AAGGAAAAAC
172451 CGGGGAGATT GAACTCCTTT ATAGTAGTCC TAAAGCTAAG GAAAAACGCA
172501 TTGTCCTCTT AGGCTTAGGG AAAAATGAAG AGCTCACCTC TGATGTTGTT
172551 TTCCAAACCT ATGCGACACT AACTCGTGTC TTACGTAAAG CAAAGTGTTC
172601 CACAGTCAAT ATCATCTTAC CTACAATTTC TGAATTGCGG CTTTCTGCCG
172651 AAGAATTCTT AGTGGGGTTG TCCTCAGGAA TTTTGTCATT AAACTATGAC
172701 TACCCACGTT ATAATAAGGT AGATCGTAAT CTTGAAACTC CTCTTTCTAA
172751 AGTCACGGTT ATCGGTATCG TTCCCAAAAT GGCGGATGCT ATCTTTAGGA
172801 AAGAAGCAGC CATTTTCGAA GGCGTATATC TCACTCGAGA TCTTGTGAAC
172851 AGGAATGCTG ATGAAATTAC CCCTAAGAAA TTGGCAGAGG TTGCTCTGAA
172901 TCTGGGAAAA GAGTTCCCTA GTATTGATAC TAAGGTCTTG GGAAAAGATG
172951 CCATCGCCAA AGAGAAAATG GGACTCCTAT TGGCTGTTTC CAAGGGTTCT
173001 TGTGTGGATC CACACTTTAT CGTTGTCCGT TATCAAGGAC GTCCTAAGTC
173051 TAAAGATCAC ACCGTCTTGA TAGGGAAAGG GGTCACTTTT GACTCTGGAG
173101 GTTTAGACCT CAAGCCTGGA AAATCCATGC TTACTATGAA AGAAGACATG
173151 GCAGGTGGGG CTACAGTCCT CGGGATTCTC TCGGCGTTAG CAGTTTTAGA
173201 GCTTCCTATA AATGTCACGG GGATCATTCC TGCTACAGAG AATGCTATCG
173251 ATGGCGCCTC CTATAAAATG GGAGATGTCT ATGTAGGAAT GTCGGGGCTT
173301 TCTGTTGAGA TTTGTAGTAC CGATGCTGAG GGACGTCTTA TCCTCGCTGA
173351 TGCGATTACA TATGCTTTAA AATATTGTAA ACCGACACGT ATTATAGATT
173401 TTGCAACTCT AACAGGAGCT ATGGTAGTCT CTCTAGGAGA AGAGGTTGCA
173451 GGTTTCTTTT CCAATAACGA TGTTTTAGCT GAAGATCTTT TAGAGGCGTC
173501 AGCCGAAACC TCCGAGCCGT TATGGAGACT TCCTCTAGTT AAGAAGTATG
173551 ATAAAACATT GCATTCTGAT ATTGCTGATA TGAAAAATCT AGGCAGTAAC
173601 CGTGCAGGGG CTATTACAGC AGCATTATTC TTGCAGAGAT TTTTGGAAGA
173651 ATCTTCGGTA GCTTGGGCAC ATCTTGATAT TGCAGGTACT GCATATCATG
173701 AAAAAGAAGA AGACCGTTAT CCAAAATATG CTTCAGGTTT TGGTGTTCGT
173751 TCTATTCTTT ATTACTTAGA AAATAGTCTT TCTAAGTAGT TGCTTTCTAT
173801 TTATTTATGT TTTAGTAATG ACTTTTATTT TAGTTTTTTT AAAATAAAAG
173851 TCATTTTTTT TATTAAAGTT TTCAATCGTC CCTGCCGATA GATCAGGTAA
173901 GTAATTACCT GTCTAATTAG GGGAATAAAG ATGATTGGAG CGCAAAAAAA
173951 GCAAAGCGGT AAAAAGACAG CTTCAAGAGC TGTACGGAAG CCTGCTAAAA
174001 AAGTTGCGGC TAAACGTACG GTTAAAAAAG CTACTGTTCG CAAAACCGCT
174051 GTAAAAAAAC CTGCAGTTCG TAAGACGGCT GCTAAAAAGA CAGTAGCAAA
174101 GAAGACTACA GCTAAGAGAA CAGTTCGTAA GACTGTTGCT AAGAAGCCTG

174151 CAGTTAAGAA AGTTGCTGCT AAACGTGTAG TAAAAAAGAC AGTAGCAAAG 

174201 AAGACTACAG CTAAGAGAGC GGTTCGCAAG ACTGTTGCTA AGAAGCCTGT 

174251 AGCTAGAAAA ACTACAGTGG CTAAAGGTTC TCCTAAGAAA GCTGCAGCCT

174301 GTGCTTTAGC ATGCCACAAA AACCATAAGC ATACATCTAG TTGTAAACGT

174351 GTCTGTTCTT CAACAGCTAC GAGAAAGCAT GGCTCTAAAA GCCGTGTTCG 

174401 TACAGCTCAT GGCTGGCGTC ACCAACTGAT CAAAATGATG TCTCGATAAT 

174451 TTGTGATTTT CGCATTATTG CTCATGTTAA CGGGAAAGGG AAACATTGGG

174501 TTTCCTTCCC GTTTTTCTTT TTAAGGTTAA AAAGCTTTAT AGAGCGAGAT

174551 CTTCAGGCTT CATGCTGTAC AGTTGGTAGG AAAATACGTA TAGTAGGTTC

174601 AGGATACTAC TTTTTTGACT CTACCTATGC AAAAATCCTT AACGAGTTTT 

174651 GATGACTTTT CCCAGGCGTA TGCAGAGAAA GTGCCCGCTA TAGCTCTTAT

174701 AGGGAGTGCT TTGGAAGACG ATAAAGATGC GCTGATTGAA TTATTAGTCT

174751 CTGAGAGCTT CAAAGAGCTC GGTGGTCAGG GACTCATGCC AGCAACCCTC 

174801 ATGTCTTGGA CCGAGACGTT TGCACTCTTT CAAGAGCATG AAACTTTGGG 

174851 GATTATTCAT GCAGAGAAAT TCCCTCTAGC AACTAAGGAA TTTCTAAGCC 

174901 GCTATGCTCG GAATCCTCAA CCTCACCTTA CGATTTTGAT CTTCACCACA 

174951 AAACAAGAAT GCTTTCGAGA ACTGTCAAAA GCCTTGCCAT CGGCTCTTTC 

175001 TTTGAGTTTA TTTGGTGAGT GGCCCGCAGA TCGTCAGAAA AGGATCATAC

175051 GCCTCCTGTT GCAAAGAGCT GAGCGTGTGG GGATTTCTTG CTCTCAATCA

175101 TTGGCATCTT TGTTTTTGCG TGCACTTGCT TCAACCTCTC TTCCTGATAT

175151 TCTCAGTGAA TTCGATAAGC TACTGTGCTC TGTTGGCAAG AAAACGTCCT

175201 TGGATCACTC TGATATTAAA GAGCTCGTTG TCAAAAAAGA AAAGGCTTCC

175251 CTATGGAAAT TTCGAGACTC TCTATTGAAG AGGGATCCGG TAGAAGGTCA

175301 CCAGCAGTTG CATTTTCTAC TCGAGGATGG TGAAGATCCC TTGGGGATTA

175351 TTACTTTCCT TCGTACCCAA TGTCTCTATG GTTTACGTAG TATTGAAGAG 

175401 GGATCGAAAG AAAATAAACA CCGAATGTTC GTCCTTTATG GAAAGGAGAG
175451 ACTACACCAA GCCTTAAATT CTTTATTTTA TGCAGAAACC TTAATTAAAA
175501 ATAATGTCCA AGATCCTATA GTAGCTGTGG AAACTTTAGT GATTAGAATG
175551 GTTAACCTGT GACTTTATAT CTTCTTCCCA ATACTCTCGG TACCCGTGCT
175601 GTAGAGACTC TCCCCTCCGT TATAGGAGAA TTAGTTCATA GACTAGATGG
175651 GCTGATTGTA GAAAGTGATC GTGGGGGTAG GGCATTTCTA AGTTTATGGA
175701 AAATTCCCGA AGTTCATAAA TTTCCTCTTG CTATTCTTAG TAAACATGCG
175751 CGCCTCCCTA AGGCTTGGGA TTTTTATCTA GAGCCTATCG TAAAACACGG
175801 GGAGAATTGG GGACTGATCT CTGATGCGGG TCTTCCCTGT ATTGCAGATC
175851 CTGGAGCGAG TTTAGTGCGT CGTGCACGTG CTTTGGGGAT TCCTGTGCAG
175901 GCTTTTTCAG GTCCCTGTTC GATAACGTTA GCGCTCATGC TTTCAGGCTT
175951 GCCTTCCCAG AGCTTTACGT TTTTGGGATA CCTCCCGCAA AGTCCTAAGG
176001 AACGTGTAAA GTCGATAAAA AAGGCAGCGA CCTCCAAAGA GGTATCTACT
176051 TCAGTATGTA TAGAAACTTC TTATCGTAAC GTCTATACTT TTGAGTCTCT
176101 TCTAGATACT TTACCTTCCT ATGCGGAGCT TTGTGTTGCC TCTGACCTTT
176151 CGGGCCCGAG TGAACTTGTT CTCACACGCC AAGTGCAATC ATGGAGAACT
176201 ACTGAGGACT TAGGTTCTGT GAAGCAAAGT ATAACCAAAG TCCCAACAAT
176251 ATTTCTTTTC CACATCCCGA ATTAATTTTT TCTTAGCTTG TTTTATCCCC
176301 GGTATCATTG TCAGGAGTTT CTTAGCATGT AAGCTCTTCG GGGATATTTT
176351 TCTTGTGATC ATAAGCAAAA ATTAAAGGTT GCATGATTCT AAATAGCGCA
176401 TAATCATAGG TTTGCGTCTT TTTAAAATAA TCGTAATCCT CTATTGCGTG
176451 TTATCAATAC TTATTTTATA ATCTTTGCCA AATAAGTTTA GAGCGTTGAT
176501 GACAGCACCC ACAGAATCTC GATCTTCACC ACCAACACTA CTCGAAGAAA
176551 CGGAACCTCT ATCCCCAAAT CCTATTCCCG CCGATATCCA AATCCCGAGA
176601 ATTACAATAT CTCCTCCTTC TCTAGACGTA TCAACAGTTG CATCTTCGGC
176651 TGAAGATATC TCGGTCTTCA TTGCAGGAGG CCCTAGAAGT TCTTCATCTG
176701 CTTCTGTGGC TTCGGATGTG TATGAGTTGG TTTGCCTTTG TGGCGGTGAT
176751 GAGGATCCTG AGCCCCCAGA TTCTGAGGTA CGTACTCTCT ATGTAAATGG
176801 AAGTTGGCAA ACGCATCAAG AAGCTGTACA GGAATTGCTG TATATTAGTG

176851 AAGTACGGGG AGAGGCCGTT CGTCTTCTCT ATAACGATGG AAGCGGCATG

176901 TCTCCTTGGC CCATCAGTCC TTGTCGTACT CTTCCTACTC TCGATCATCC

176951 TTTATGTCAG GCCCTTTTGA CGGTTTGGGA ACAGTTTTTC TCTGCTCCTG

177001 AAAATCAGAA TCGTGAGTTT CTAGTGATTT TCTATGGGGA TGCATCGCCT

177051 TATATACAAC AGGCGTTAAC GCAATCTAGG CATAGTCCAC GTATTGTTGT

177101 TGTAGGGATT TCCCCGACGG TCTTTATTCA AGGAGACTTT AGGGTCCATA 

177151 ATTACCGTGT TTCTGGAGAC TTCTTTAGTT CTCTGGATTG TCGGGGAACT

177201 AGGGCGGAGA ACACCACGAT ACTGCCGTAT TCTTCGGGTC TTGAGGGTGT 

177251 TTTTCTGCCT TCTATCCGTT GTCCTTCTTT TACTTGGGCG GTGCGTTTTG 

177301 GAGAGCAATG CTTGGTTGCG AATAGGGGTG AGGATGTAGA AGATAGGGGA
177351 GGTCTTTCTC AAGATGCCGA AAGATCACAG TTACCACACA GTGAAAGAGA 

177401 TCTAGCTGTT GTCATTGATT CTACGGATCC TAGTTCTATG AGTAGGCTTG
177451 TAGAATGGTT GAATCAAGGA TCGCCTTCAT CAGATATGGA AATCAATCCC
177501 TATCCCCAAC GGTGTCCTGA TGTAGCTCTT TCTGCGCTTT ATGCAATTTC
177551 TAGAGTTTCA GGACTTGCGC AGGAATGGAT CCTAGCCTCT GTTCATGAGG
177601 GCTTAGACTT GCAGATCTGT TACTCTTTAA TTTTGATGCA CACGACGTTT
177651 GCGGTCCGGT ATTTTTTCTT ACTCTTTACA AATTATCCTC AGTCTAGAGA
177701 GAGATTCCGT ACTGCACGAA TCGTAGCACA ATCTCTATAT TTACCAAGCA
177751 TCCTTGTTCT TGTTTTTGAT TGTGGCAACG TCCTGCGTAA ACTATGGATG
177801 CCTCAGGAAA TCTTACGAGC AATTTTTATT TCTGCGTCTA CAATTTCAGG
177851 GAGTATTGTC TTTGTAGAGT GCACTCGCTG GATGGGGCGA GGTCTTAGAC
177901 ATCGTGTACA ACAATTTGTG CAGCAACGAG TTATAGGAAG TGGCCTGCCT
177951 GTAGGAACAG TACGAGCTTC TTATCGCGAT CGTGCAGGCT TTATCATAGG
178001 CTTTTTACAA ACTGTACATG GAGGACTTTA TTTGCCGGTA TCCATTATGG
178051 TGCTTAACCA GATTGCAATA CAAGTTCCAC GTATCTTAGT ACGTCCAAAT
178101 AACACTGCTG TTTATGATCT ACATAATAAA AGTGCTGAAG AAAATTGGAG
178151 CAGTGGTGAT GTATTAGCTG TTGGCCAAAC ATTAAACTTC ATCTTATGTG

178201 CTTTCGTCTT GTTCGTAAAT CTATGGTTCT TTGTGAAGTC CGTATTACGC

178251 CATTCTAGGC GTCGTCGTCG CTAATGTATC TTTGGAGACA TCTTGGTATT

178301 TTTAACATAA GAGCACGGCA AAAAAAAGAA AGACCGAGAA AGAAGCTTTA

178351 CCTTTAAGAG TTCTCTGATG TTTCCCGGAG CATCAGGAGT CCGAAGGGCA 

178401 GAGTCTAGGT TTTAGCCTTC CGCAGGGATT AGAAGGAATA TCCCATAATC

178451 TCTTCCGCTA TTGTATCGTG GAAGAGACGG CCCTGTCTAT TTAGGGCAAG 

178501 ACATTGTCCA TGCACACTGA ATAGGTTTTG TAATTTTACA TCTTGCGTAA

178551 GCATGGAGAT AAGTGTGGAG GGGAACTCCG CGAGGTCTGC TCCTTCAAGG 

178601 AGTCGGAGTC GCAGGGCTAA GGCTTCTTTG ATTCGTTCTT TTTTTGGGAG 

178651 AATTTCTGAG GTCTCTTGGG TAGGGAGATT CTTACGTACA GCACGTAGAT

178701 AGTGAGAAAT ATGACTATAA TTTTTTGACC GCTCTCCGTG AAGGTATTGC 

178751 GAAGCTGAAA CTCCTAAGCC TAAGAAAGGG CGATCTGTCC AGTAATAGAG

178801 GTTGTGCTTT GCGGGGTAAT CTGGCTTGGC ATATGAAGCA AGTTCATAGC

178851 GTTGGAACCC TTGGGAGAGT AGGAGATTTT CAGCAAGGAG GCTCATCTCA 

178901 GCTAGAATTT CTTCTTGGGC AATTGTGGGG ACTAGAATTT TGCGGTGTTT

178951 ATAGAAGGAG GTGTGGGGAT CTATAGTGAG GTTGTATAGA GAAATGTGAG 

179001 TGATAGGGAG AGTCAGAGCT TGATGTAGGT CGCTTAGGAA TATCTCCAAA 

179051 GACTGTGTGG GCAGTCCGTA GATTAGGTCT ATAGAAAGAT TAGAGAATCC 

179101 GTGATTCTGG CATTCTTGCA GTGCTGTGAT TGCCGCAGAT GAAGAATGCG 

179151 TTCTTCCGAG GAGCTGTAGG ATAGAGTCGT CGAAGGTTTG TACGCCAACG 

179201 CTAATTCTAT TTATTGGAGT CTCTTGTAGT TGACGTAGAT AGCTTACGGT 

179251 GAGATTTTCG GGGTTGGCCT CTAAAGTAAT TTCCCGGGCA TGGGGGGCTA

179301 GCTCTTTGAG GATGCGCTTA AGATCAAGAG GAGAAACTAA TGAAGGTGTT 

179351 CCCCCTCCAA AAAACACAGT CTCTATGAAA TGCGTCTCTT GGATGGGGGC

179401 TAGCTTTCTT AGCCCCTCTT GAATTACAGC ATTACAATAG AGCGATACAG 

179451 ATTCACTTTT GTAGGGGATT GTATAAAAAC TGCAATAGCG ACATTTTTTT
179501 GTGCAGAAGG GAATATGAAT ATAAAGAGCT AGGGGAGCCT TACCATTCAT

179551 TAGCATCGGG ATCAATGATA CCACCGCGTC TCCAGCGATT GCGATCGCTG 

179601 AAAGGGTCTT CATCGTCTTC GTGGGAATAG TCGACTTCTA CTGCCCCATC

179651 TTCACGTCCT TCATCGGTGA GTTCCAGTTC TACGGTTTCG CCTGGGTCGA

179701 TATCAATTTC TGTTTCTGTG GGTTCCACAA ATTCAATATC AGACATACTA 

179751 GGACCATCAG GGTAGACAGT CTTTGCGGGA CTGCGTCGTG GTGTGACGTA

179801 CTCTTCAATC TCTCCGTTTT CTTTGAGGAG TTGTAGACGT TCTTTTTCTT 

179851 CGTAAATTTT ATGTTCTAAG ATTCGGATTT CTTCTTGGTG CCTGCTAATT

179901 TCTTTTTTTG GCACTAAACC CAACTGCATC CATTGCGTAA GGTCATGAAG

179951 TTCTGATTCT AATTTTTTAA GACGTTCGCT TTTCATCCAA GGGTTCCCAT 

180001 CCTGTATTCG AGAGCAGGTG CATAGCATTT TTTTGTTTTT TCAACTGCAG 

180051 AAAAAAAACA AAGATTTTTT TTTTTTTGAA AAATAAAAAT TTGCTATGAA 

180101 GCAGACATTT AGATCGTATT TATTGAGTTT AATTATTTTA TGGATTCCGA 

180151 GTTTGTGGGG CAAGTATATT CTTCGGATAT GGATTGGATC GAGTCTATGT 

180201 ATCAGAGATT TATGAATCAC GAGACTTTGG ATCCTTCTTG GAAGTATTTT 

180251 TTTGAAGGGT ATCAGCTCGG TCAAGCAGCA TCTCCATCAG AAGCTAGTAC 

180301 TAAGATTTCT GGGAATGAAA CTATTGCTAT GCTTCAAGAA CAAAAATCTC 

180351 AGTTTCTATG TACGATTTAT CGTTATTATG GATATTTGCA AAGTCAAATT 

180401 TCAACGCTTG CCCCAACTAC AGATTCTCGA TTCATTCAGG AAAAGATCGC 

180451 TAAGATTGAT CTGGATGAGC AGGTGCCTTC TGCGGGTCTA CTTCCTAAAG

180501 CTCAGGTTTC GGTACGAGAG CTGATCGAAG CTTTAAAAAA ATGCTATTGC 

180551 GGAAGTCTTA CTTTAGAAAC CCTAACATGT ACTCCTGAGT TGCAGGAGTT 

180601 TGTTTGGAAT CTTATGGAGA AGCGACAAGT GGAGCGCTTT GCAGAGCAGC 

180651 TCCTTCGCTC CTATAAAGAC TTATGTAAAG CAACGTTTTT TGAAGAGTTC 

180701 TTACAGATAA AATTTACAGG TCAGAAACGT TTTTCTTTAG AGGGCGGAGA

180751 GACCTTGGTC CCCATGTTGG AGCATCTTGT TCATTATGGA TCGGCATTAG

180801 GAATTTCTAA CTACGTTTTA GGAATGGCCC ATCGAGGTCG TTTGAATGTA
180851 TTAACGAATG TTTTGGGAAA GCCTTACCGT TATGTCTTTA TGGAGTTTGA

180901 AGACGATCCT GCAGCACGTG GTTTAGAGAG TGTTGGGGAT GTAAAGTACC 

180951 ATAAAGGGTA TGTGCTAAAG TCCCATCAGA AAGATAGGGA AACTACCTTT 

181001 GTGATGTTGC CAAACGCTAG TCATCTCGAA TCTGTAGATC CTATTGTCGA 

181051 GGGGGTCGTG GCTGCCTTGC AACACCAAGG TCACGCAGGT AAAGAGCAAA 

181101 GCAGCTTAGC AATTTTAGTT CATGGAGATG CAGCATTTTC TGGTCAGGGA 

181151 GTGGTTTATG AAACTCTCCA GCTGAGTCGT GTTCCAGGGT ATTCTACTGA 

181201 GGGTACGCTT CACATTGTTG TGAATAATTA CATAGGGTTT ACCGCAGTGC 

181251 CACGGGAGTC AAGGTCCACC CCTTATTGTA CGGATATTGC TAAAATGCTA 

181301 GGGATTCCTG TATTTCGAGT GAATAGCGAG GACGTCGTTG CCTGTATAGA

181351 AGCTATAGAG TACGCTCTGC AAGTTCGTGA GAGATTTAGT TGTGATGTGA

181401 TCATAGATCT CTGCTGTTAT CGCAAGTATG GACATAATGA AAGTGACGAT 

181451 CCCTCAGTAA CAGCTCCCTT ACTCTATGAT CAGATTAAGA GAAAGAAGAG 

181501 TATTCGCGAG CTGTTTAGGC AATATCTGTT GGAAGGGCAG TTTGCAGATA 

181551 TTTCTGAAGA AACTTTGGCA TCTATTGAAA AAGAGATTCA AGAGAGTCTG 

181601 AATCGTGAGT TTCAAGTATT GAAAGGGACG GATCCAGAAC CCTTTCCTAA 

181651 AAAAGAATGT CATCACTGCG ATCGCTTAAA TAACGGCGAG CTTATTTTGC

181701 ATGATTGTGA TGTTTCTTTG GATCGCGAGA CTCTTTTTCA TATGAGCTCG
181751 CGTCTTTGTG GTTTCCCTGA CAATTTTCAT CCCCATCCTA AAATTAAGAC 
181801 TCTTTTAGAA AAAAGAATGA AAATGGCAGA AGGTGGGGTT GGTTATGATT 
181851 GGGCGATGGC CGAAGAATTA GCCTTTGCTT CGCTATTAAT CGAAGGGTAC 
181901 AACCTGAGAC TCTCAGGTCA AGATTCTATT CGCGGGACAT TCAGCCAACG 
181951 ACATTTGGTA TGGAGTGATA CTGTGACTGG AGATACCTAC TCTCCATTGT
182001 ACCATCTTTC TGCAGAGCAG GGCTCTGTAG AAATGTATAA TTCTCCTCTT
182051 TCCGAATATG CAATTTTAGG GTTTGAGTAT GGCTATGCTC AACAGGCATT
182101 AAAGACTTTA GTGTTATGGG AAGCGCAGTT TGGGGATTTT GCTAATGGTG 
182151 CACAAATCAT TTTCGATCAG TATATCTCTT CGGGAATTCA GAAGTGGGAT
182201 TTACACTCTG ACATTGTTCT GCTTCTTCCC CATGGGTATG AGGGCCAAGG
182251 ACCCGAGCAT TCTTCATCTC GTATAGAACG TTATTTGCAA TTAGCCGCGA
182301 ACTGGAATTT TCAAGTGGTC TTGCCTTCCA CTCCTGTGCA ATATTTTCGG
182351 ATTCTCAGAG AGCATGCTAA GAGAGATCTT TCTTTGCCTT TGGTGATCTT
182401 TACTCCTAAG TTGCTGCTGA GATATCCACA ATGTGTAAGT AGTATCGAGG
182451 AGTTCACAGA ACCTGGGGGA TTCCGTGCTA TTCTCGAAGA TGCCGATCCT
182501 AATTATGATG CTTCTATTTT GGTATTGTGT TCGGGAAAGA TCTATTATGA
182551 TTATGCAGAA ATGCTTCCTC AAGATCGGCG TAAGGACTTT TCTTGCTTGC
182601 GTATAGAGAG CTTGTATCCT TTAGCTCTTG AGGATTTAGT GAGCCTTATC
182651 GATAAGTATT CTCATTTGAA ACATTTTGTT TGGCTACAAG AAGAATCCAA
182701 GAATATGGGG GCCTATGACT ATATGTTTAT GGCGTTGCAA GACATTCTTC
182751 CTGAGAAACT GCTATATATA GGACGTCCTC GGAGTAGTTC CACAGCTTCT
182801 GGATCAGCGA AGCTCAGTCG TCAAGAGCTG GTCACGTGTA TGGAAACCCT
182851 CTTTTCTTTA AGGTAAATTA TGACTACAGA AGTACGCATT CCTAATATTG
182901 CAGAGTCGAT TAGCGAGGTG ACCGTAGCTT CCTTGTTAGT TACAGAGGGT
182951 GCTCTGATTC AAGAAAACCA GGGCTTACTA GAAATTGAAA GTGATAAGGT
183001 AAATCAGCTC ATTTATGCCC CAGTATCGGG AAGAATTTTC TGGGAGGTTT
183051 CAGAAGGCGA TGTTGTTCCT GTAGGGGGGG TAGTGGGAAA AATAGAGCCC
183101 GCAGGTGAAG GGGAAGAGCT TGGAGATTCT CAGTCTAAAG AGACTATAGA
183151 AGCTGAGATC ATTTGCTTTC CTCAGTCTGG GGTGCGTCAG TCTCCTCCAG
183201 AGAATAAAAC GTTTATTCCT CTTCGTGATC AGATGGACCA AGGATCCCAA
183251 GGTCTTTCTG CAGGAGATCG AGGAGAAACT CGAGAACGCA TGACCTCGAT
183301 TCGTAAGACA ATTTCGCGGC GTCTTTTGTC TGCTTTACAT GAGTCTGCGA
183351 TGCTCACGAC ATTCAATGAG GTCTATATGA CACCTCTTTT TCATTTGCGA
183401 AAGGAAAAAC AAGAAGAGTT TCTATCTCGA TATGGGGTGA AGTTAGGATT
183451 TATGTCTTTC TTTGTGAAAG CTGTCTTAGA GGCTTTGAAG GCATATCCAC
183501 GAGTGAACGC CTATATTGAT GGCGAGGAGA TTGTTTACCG TCACTATTAT
183551 GACATTTCTA TTGCTGTAGG TATCGATCGA GGACTTGTGG TTCCTGTGAT

183601 ACGCGATTGC GATAAACTTT CTAACGGGGA GATTGAGCAG AAACTCGCAG 

183651 ATCTTGCCCT TCGGGCTCGT GAAGGCCTAC TTGCAATAGC GGAGCTTGAG

183701 GGAGGAGGTT TGACAATTAC CAATGGAGGC GTATATGGAT CGCTACTTTC

183751 GACTCCCATT ATCAATCCCC CGCAAGTGGG GATTTTGGGG ATGCATAAGA

183801 TAGAAAAGCG CCCCGTTGTT CTTGATAATG AAATTGTAAT TGCAGATATG 

183851 ATGTATGTCG CTTTAAGCTA TGATCATCGT CTTATTGATG GGAAAGAGGC
183901 TGTTGGGTTT TTAGTCAAAG TGAAAGAAGG CCTAGAGAAT CCTGCCTCAT 
183951 TACTCGACTT GTAATTTCTC TGATTCTCAT AAAGGCTCTT TTAGAGCCTT 
184001 TTCAGATTTT TTAACCTCTT TTCTTATCAT GAAAAGGATT GCACTCATCT 
184051 TGAGTAAGGA ACAATGTAAA TTGAGTATCC GCACTTACAT TGGCTTGTAC 
184101 CGCAGCCTCA TTCAAGGCGC CTAACAACTT TCCGGGTAAG TACACATTGT 
184151 CTCTTTTAAA GTGAGGTTTC CCGGTATGAT CAAGAGGAGT GAAAGCTTCT 
184201 TCTGCGTGAA TATGTACTGA TGTGAAGCCA TTTTCTAAGA GATCTACAAA 
184251 CAGATTTTTC AAGAACTCCT TTGAGTCTTT TTCAGAAAGC GAAGAGCTCA 
184301 TCGGATCTTT TCTTTCTAAG ATATCTTGTA CATCCTTGTC GTAATAACCC
184351 CAAGGAGTCC ATGCTAAAAC TAAAGTTGAA CTGACACCAG GAGTTGGAGA
184401 ATTGAAAATT GCCCATTGAC CTGCTTTTTT TCTAGAGTTC ATTTTAGTTT 
184451 TCATAGCTTC CCACTGTTCC TCTCCTAAGA GATCTTGCAG TTGCTTTAGG 
184501 GGAGCCGAGA CTTCCTTATC CCCTTTCCAA TGCGAAGTTT CTGGTGAAAG 

184551 GAGGAATAGC GAAGGTACGC GAGCCCCGTG ATCTCTAATT TCTGTGAGAT
184601 TAGGAGCTTT AGCTGTGACA ACCTTAAATT AAGGACCCTT CTTTAGATGG 
184651 TGAATTTCAC AATTCGTAGC TTCTGTAATT TTGATTTCTT TAGTGAGTTT 
184701 TTCTTTGGGA GGAAGAGCGT CGTAAGGTAA GGAACAATGA TGTTAATAGG 
184751 GCTAGTTTAG TTAAATAAGA ATAGGGGATC TCAGGTTTAG GAGTGATGGG
184801 TTGTTGCTCG ATTACAATGA TGGTTAATGG AAGAAGAGCG TTTCTAGAAA
184851 GTGCATCTGC TTCCTCGTCT TCTTTAGGCA ACAAGCTCAA GATGAGAAGC




184901 ACGAGCCCGA TGGCTAATCC CACGAGTGTG ATAATACCAG CAATGCTATA 

184951 CGCCTGTATT AGAGGGAAGC CTAAAGCTAA GAAGGTAACG AACGTGATTC 

185001 CAGCTGCTAA CATTAGTGCA GCGAAAAGTA TGTAAAGAAC AGCACGCGCA 

185051 ATTTGAAGAG CAAGACTTTT ACTTCTGAGT GTGCCTTCTG TAGCTTGGGT 

185101 TTTGTTAGCT TCAATCTCAG CGAGTTGCGC TTTTTTTGGC TTCACTGGTT 

185151 CGTGAACTTT ATGATTAGTT GATGCGCCTA ATTTCATAGA AAGTCTTTAA 

185201 GCTTAATTTT TTCCTAAAAA TTGTAGTTTA CATGTTGTTT TTCATCAAGA 

185251 AGATTACTGA ATTCTGAGTG GGTTGGAACG AAGCATGTTC TTCTAGAAGC 

185301 TAAGGTCCTT AGGGGAAAGG AAAAGAAGCT TGGCTGGATC TTTTTAATCT 

185351 CTGGGTAGGA GAAGGACGGC GGTTACATTA TTTCTATTTT TTGAATGTTG

185401 GCCTGTGAGA TTGGGATCGG GATGGTGAGC AAGGTACCTT GAAGAAAAGA
185451 AAGCGTCGTG TTCCGTGTAG GTACAGAGGT CTGAGATAAA AATGCGGTCT 
185501 TTAGAGATTC CTAAATTCGT AAGTTGCTTG CGAGCAATCG CACGCAGGTC
185551 AAAATGGTTT TTGGGATTCA TAAAGGGAAG AAAGCTACGA GGAAATAACG
185601 TAGCGTAATC GGGATAGATA GCATAATCTG GACCGATGGA AGGGCCGATA 
185651 GCTACGAAGA GATCTTGTGG TTTTGTATGA AATAATTTTT TCATAGTACC
185701 TACGGTGACA GCATAGATAT TGCCAAGCAA TCCTCGCCAT CCGCTGTGTA 
185751 CATTTGCGAT TGCGTGGTGT TCTCGATCAT AAAAGATAGC TGCTTGGCAA 
185801 TCGGAATGGC GGATATGGAG AGAGAGGAGC GGAGACTGCG TGCACAGTCC 
185851 GTCTGCAGGT TGGTAGGTGG GGGATGTAGG TGTAACACAA CGTACGGAAG 
185901 TGCCGTGGCG TTGATGAAGG TCGCAATACT TCGGAGATTG GAGAGCTGAA 
185951 GCAATCTCAG GATTCTTGGC TGCGAAGACC GTGCCCTCGG CATCCTTTTG 
186001 TTTTGAAAAG ACTCCGTGGC GTATAGGCAA ACCATCGAAT TCTTCGAAAG 
186051 TCCAATAGTT CAGAGGGTGA GTGTGGAAGG ATAGAGTCAT ACTGTCAACT
186101 TAGTTTCTTC AGGGTCTTTC CATACCCCAT GTTCTTGTAA AAGTCGAATT 
186151 AATTCTTCTT CAGCATCTTC CATGGGTATG TGAGCTTTTA CACAAGTATG 
186201 TTTTACATAA AGATCGATCA TCCCTGTTTT GGAACCTACA AATCCAAAAT
186251 CTGCATCTGC CATTTCTCCA GGGCCATTCA CAATACAACC CATGATAGCG
186301 ATCTTAAGTC CTGGTAGGTG CTGCGTTCTC TTGCGGATAC GTGTGGTGAC
186351 TTCTTCAAGA TCAAAGAGGG TCCGACCACA CATAGGACAG GAGATGTACT
186401 CTGTTTTTAC AAGGCGCACC CCTGCATTTT GTAGAGTGCC AAAGGCAATT
186451 TTTAGCACGT CCTGTAGGGG AAGGTTCGGT AAGTCAAGAA CCACAGCTTC
186501 TCCAAGGCCA TCAAGAAGCA GAGCTCCAAA CTCTGTTGCT ATGGAAATAG
186551 CAGCTTCTTC TTTATTGTCA AAGTCCCTTG AAAATACTAG CTTGGTCGGT
186601 TTTCCTTGGT GTCCTTGTTT TTCAAAGAAA TCTCGGGAGG TATGAATGAA
186651 GGGGTCTGAA GCATGAAAAT GCACAAATGG AGCTTGATGA ACAGCAGGGC
186701 TATCCCAAAT CTCCTCATTG TGTTCATATA GGCAAGGCAC TTGATGGTGG
186751 TGGAAAACTA AAAAGTGTTC TCGAAGTACA TCTGTAATAG GAGCATCTTT
186801 TAACTCAGGG GGAACGACGA CCCCTTCAGG AGTTGTGAAT GCTTTTTCTT
186851 TTGTTACGGG ATTTACCCCC AAGTGTTCTA AGAGTTCTTC AGGAGTAAAG
186901 TCGGTAAGAT GGTGAGGATA GAGTTTTAAA AAGACTCCGT AGACGTCTCC
186951 CCAAAGTGTT GTTTTCGCAG GCTTCTCTGC AGCAGAAACA AAGTTTTCGG
187001 AGTGTTGTAG GGAAAAGGGA TTTTTCTTTT CTGGAAGGTC TAAGTAGATT
187051 TTCGTATGGC GTAGCAAGCT ATCACAGACA GGAATTTCTG TAGTGGGACA
187101 CCCTGTGAGA GAGCAGCGTA TGGTATCCCC GAGTCCTTCG GCAAGAAGAG
187151 TTCCGATTCC TACTGCGGAT TTTATGATCC CGTCCACGCC CATTCCAGCT
187201 TCAGTAACTC CAAGGTGAAG GGGATAGAGC CAGCCTCTAG CATCTAAGTC
187251 TTTAGCAAGT TGGCGGTATG CAGTTACCAT GATCTTCGGA TTGCTAGATT
187301 TCATTGAGAA GACAACATCT CTATAATTCA GCTTTTCACA TACAGCGATA
187351 TATTCAATTG CTGAGGCTAC CATTCCTTCG ATAGTGTCGC CATATTTTTG
ACCCGTGGTT CACTCCAATG CGCATAGCCT
187451 TGCCTAGTCG CTTACATTTC TCTACTAAAG GAGCAAACTT TTCTTCAAGA
187501 CGCAGGAGAC TTTGGGCATA GCTTGCCTCT GTATAGATCT TCGTCCCCTT
187551 GAACATGTTC CTCTTATCTA TGTAGTTGCC TGGATTGATG CGAACCTTGT



187601 CAGCAAAATC AGCAACTAAC ATAGCTGCTT GAGGGAAGAA GTGGATATCT
187651 GCAACCAAAG GGATATTTAA CCCTAGAGCA ATCAGACGTT CTTTAATTTT
187701 TTCACAGGCT TGTGCTTCCT TGATTCCCTG TACAGTCACT CTGACAATAT
187751 CACAATTATG TTCCGCTAGA GCGTAGATTT GCTCTACTGT ACTGTCAATG
187801 TCTGTGGTTA ATGTCGTTGT CATTGATTGG GTTTTTATTG AGTGGTCACT
187851 GCCTATGTAT AAGTTGCCTA TTCTTACTGT ATGGGTTTTG CGTCGCGAGG
187901 AATTGATGGC AGGGGTAATG AGTGTCATAA AAATCTCAAA AATTTCAGAG
187951 TTTTATCAAT TATAAAAGCC GAGAGAATTT TGTTGAAGCT GATAAAGCTT
188001 CTCTAAGGAT GCCTACATTT ATCAATTTTA CAGAACTTCA CCTCAATAAG
188051 AGGAATTTTC AGAGAGAAAC GACGACCTTG GATGGCCGAG TGTTTCAGGC
188101 CATCAAGAGA GTATATGGAG AGATTATTGT GAATTAAAAT AATTATTTCA
188151 TACTTATCAA GGGGATAAAG ATTTTGCTTT AGAGGACGAT TCGTAAGGAA
188201 CTCGAATCTG TTTTTGTGTT CGTAGAACGA ATCCCGAAGT TTGACAAAGA
188251 GAAGGAGTTC CCAAGAGGAG TGCTTGGGGC TCGTACTTTG GGAACGCTTT
188301 TTTCTAATTT GAAATCGGTG TATTTTGTGA TGGATGCGTG ATTTTATACA
188351 TCGAGATTAA CATGATAGCC AGAAGGATTA GGGAGAGCAA GAATAAGGGC
188401 CCTATAACTG TAAAGTAGGC TCCGACAGAT GTTGCTACAA AGATCATGGC
188451 TATAGCAACG ATTCCAGAAA CAAAGGTATT GAGGAGGAGT AAAGCAGTTG
188501 AAGCCGCAAA GGCAATTTTT ATGCATCGTC CTGATCCCGA TAACCTTTCT
188551 AGGAATTCTC CAAGCTTTGT TTGTTCAACT GGAGATGCAC TTGATGTTCC
188601 TGTGACTACT GGAGAAGACA TGTGATTTCC TATATTTCTA TGACAAGCCC
188651 TATAATTTTA TAGTTTTCTA TTTTAAATGC AATGTAATTA GTGTTTTACA
188701 AATTAATCTC CTACTAAAAT AGGAGGAGAC ACTCTCATAT TTTCTATGAC
188751 CTAGGGAATA AAATATTTTA TTTATAATGA GTGAAAAACA CCGCTGTTTG
188801 TCTAAGAAAT ACTTTTAGAA GAGAGTTTAC GTTTTATCTG CTGTTCAGCA
188851 GTTGTTCCTT CGGTTTTGTT TACAAGGGCA AGCTTTATAA CAAGAAGAAG
188901 CACACAGAGT ACTGTTGTAA GAATAAGGCC TCCTAGGATA ATCAAGCAGA




188951 TGGGAGGAGT CCCAGGAATA AAGAAAGCTG CAATTATTCC TAAAACTTCT 

189001 ATTACTAGAA TGGCAACAAG GATTTTAACT AAAAGTTTAA TCGTCTTCGT 

189051 CGTTTTATCT TGATGCTTTG CTTTTAGAGC TTTGGAATTT GGTTCCATGA 

189101 GGCCTAGATT TCCAGAGGAA GGGCGGTGGC TTGTGGGTAG GGGGGCTGAG 

189151 GACACGGGCA TCGTCTTTTA GTTAGAATTT TCTTTATTTC GCAAAAGCTT 

189201 TGTCTTATGG AGAGCAGAGG CGTTTCTATT TGCTTGAGAA AGTATAGCGT
189251 GGTTTAGAAA GTTAGAAAAG ATTTTTTAAA AAATCTTCTA TGTTTTGGAT 

189301 GTTTTAAAGA GAGAAGTCTT ACCATTTTTT AAGAAAGGAT TTCCTTTTCA
189351 CAAAAAGGCA TCTACTCTCT TTTGCTTAAC CAGGGGGAGA GATTAGGGTG 
189401 TGATGAGGGG ATGTCTAGCA AGAGTTGTCA AAATGATAAC CCACGGTTGA 
189451 TTTTGATTTT CTGCTTCTGA TCCAAATGTC TGCAGAGCAT CTATCAAGGC 
189501 TAACTTCACA CCATCAATCC ATTGCAGACG TAGACTAGTA GTCTCTTCGT 
189551 CTTTTGGAGA GCCAGGGAAA TGGGAGGAAA GCAAGGGGAG CTGAACTACG 
189601 GTTGTTTTAC GGCGCTTGGC CTCATTCAAA CAGTTAAGGT AGGCTTGCCT 
189651 ACAAAAGGTA AACGCATCAT TAGGATTGTA GTTATAGTCA GAAGCTTTCG
189701 GCCCTAGAAG TTGTCCTAAG AATTGCGGTA GTCCTGCTTT ACCTGCATTC 
189751 GTGGTTCCGT TTAGATTTTC CCATTTTGCT GAGCGGCATT CTCCTAATTG
189801 TAGGGGCTGT TTTGAACGGA GAGGATCCGG AGATTTTTTC GAATTTTCCC 
189851 AACAGCGTAC ACTAGTTGCT TTCGCAAGTG CGAGACTTGT TCCTTTTACG 
189901 TTGTTTGCCA TGTTTGGGGT GGCTGCATTC ACAATCATGA CGATGCCTTG 
189951 AGTTTTATAG CGAGGCTTTT CAATGGGCCC TAACGTAGAG ATCAAAGTCA 
190001 GGTTTGAGTT TTTCAAATGC CAAGCATAAC CGGTTTGATT GTCAGCAGCG
190051 ATGAAAGGCA TATCTGTGTT AGCCTGTAGA TCAGGTATAC GGTCCCAATT 
190101 TTCTTGTAAC AGTGTTTGAT GGGTTCTAGG GGAAGGAAGC GGGGGCTCTT 
190151 CTAGGACAGC TTCCGTAGGC ACTGGGGTGA GTGCTAGGGG CGGGACAGGA 
190201 GGAAGGTCTG TATCTTTTAT GATGGGCGTC GGCTGATGCG TTACGCTTAT 
190251 AGGACTTTTA GGTTCTCTTA AGAGAAAATA GAGAAGAATA CTAAAAGCCA
190301 AGACTGCTAG AGCGGTAAGA ATAAATAATA GAGGAATGCC TAAAATTATA
190351 GCAGCGGTTC CAGAACTTAC AGCAATTCCT ATAATTAGAA TAAAAGCGAC 

190401 AAGGTCCAGA CCTTTGAGCT TATGAGATGC GAGTTGCGTG GGCATGCTTT
190451 CGCTGCTGAT CAGCATGGTT AGATTTACGT TAGTCAAATT GGGTTCTGTG
190501 GTGGACATAA AACTTTTTAA ATAAAAAACA AAAAGTTTAA AAAAGATTCT
190551 TTTTTATGTG AAGTTATTTA TATTTTAAAT AGAAGTTGTT TATTTAAAAT 
190601 AATAAATAGA CAACATTTTC ACTTTAAAAA GTATTGGATG CGGCTTAGAG 
190651 CCAAGAATCT AGGGGGTGCT TGTGATCAGG ATAAATTGGC TGTGTTTTTA 
190701 GAAAGCTGTT CGATTGGAAG CACGGACGCT TTCCGACATA GACTAGGGAG 
190751 TCGTTAGATC GAAACGGGGT GGAATGATGG GAGGCTGGTC CTTGTCTGTA 
190801 AGGATGATAA TTTTTTGCTC TTCCTGGTTG TCTTGTTCCC ATCCAAAATC 
190851 TTGAAGAGAT GTGATGAGTG CCAATTTTAT AGCCTCGATC CATTCTGCTC
190901 TTGTTTTCCC GAGGTTTAGA AGTCTTGATG GAGCAAATAG ATTACATCCA
190951 ATGAGGGGGA GTTGAATCAC ATCAACGCCT ATGATTTCAG CTTCTTGGAA 
191001 CAGGTTATGA AACGCTTTCT TGCTTACTTC AAATGCTTGC TTAGGATCGT 
191051 TGTTACACTT AGCAGCTTCG GGACCAAGCA GTTGTGCTAA GAAGTGTGCT 
191101 TTGCCTGGGA CATGGTCGTT TGAGGTGTGA TCACTATTTT CCCATTTTGC 
191151 TGAGCGGCAT TCTCCTGGCT GTAGTTGGGA TCCAGAACGA GAGTGCGCTC 
191201 TAGGGAGCCT AGATGCGTTC CAACACTGTA GACTTGTAGC CAGGGATAGA 
191251 GCTTTATTCG TTCCCCCTCC TTCTCGGGAG ATGTTCTCGT TTGCTGCGTT 
191301 AACAATCATC ACCCTGCCTT GAGTTTTGAT TCTTGGAACT GCAATATCTC

191351 CTGAGGTAGA TATAAAGATA AGCTTCGAGT CTTTAAGCTT CCAAATAAAA

191401 CAGGGCTGCT GAGGTGTTTC TGTAGTAAAC GATGGGTCTA CAGCGGCTAA
191451 GCTCGGAAGA AGATCCCAGT TTTGTTCAAG AAGAGCCTCA TAGCCAGGAG 
191501 TGACCGTGAT AGCGGTAAGT TCTTCTGGGG GTGTGGTTAG ATCAGCAAGA 
191551 GAGACTTCGG GGAGTGATTC GGGATCGGGA AGTGGATCGA TCTGCAAAGG 
191601 GAGGTCCTGT TGCTCAGGAG GCTGGGGAGA GGGAGACAGA GAACTTTGCT
191651 CAGATTCGGG TTCGATTTGA GGAAGAATCT CGTATAATTT AGGTTTCTTT

191701 AAAAGGGAGT AAAGTGAGAC CGCAGCAATT GCTAATGTTA TGACAACTAC

191751 GGCAGAAAAA ATAGGCATAG CCAAAATGCC AGCAGCAATG CTAAAGCCCA 

191801 AAGCGCACGC TAGGGCTAGT GTGAGGGTGA CAATTTTCAG TATTTTCTCG 

191851 ATTCTATCTG TACGGTTGAG AGGGAGTCTA ATCGGATAGG AATGTTTCGC 

191901 AGGAGTTCTG TAGAGACTGG CGTCTGTATA AGAGGGTAGG GGATTAGAAT

191951 CTGTCATAAT ATTTTATTAA TAGATCAATT TTCAACTATT ATATTACTAA 

192001 TTTGTATTTT TATTAGATTT TTTTATAAAA CAAAATTAGT TTATTAATGC 

192051 ATCCTATTGA AATAGATCTT TTTAGTTAAA AGGGACCATA AGTAGCTGTT

192101 TGTGGTCTGT AAGGATTATA GTCATGGGAG TTGAGGGGTG CTGCGCAGCA 

192151 AAGGAGCGAA GAGCTTCCAT AAGACCTTTC TTTATATCAT TTACCCACTA 

192201 CGTACGGATG TGGTAAAGCT TATATGCACT GCTATTAGGC TTTGTTTGGT

192251 TTACGGGTTC TAGTTCCAGC TTTCCTCCAG GTGAGTATAT AGAGGAAGAG

192301 ATCAGAGGCA CTTGGACCAC AGTGGCTTGG TTATTGAGAG CTTCATCAAA
192351 ACAGTTCAAA TAGGCTTTCT TAATAACATT GCTTAATTTC TCAGGATGTG
192401 CTTTCAATTC TCCTTCATAT TTAGGACCAA GAAGTTGTGC TAAGAAATGT 
192451 GCTTCTCCTG GGTTCGTATC ATTTATTCGT CCAGTCTCTA TTGATCCAGG 
192501 GTGCTGAGCG GCATTCACCC ACAGATAATC CTTTGCCAGT GTTTATTTTT
192551 CCCCCAGATG TTCTCGTATT GTTCCAACAA GTAGGGTGTG TGGCTGCTGA 
192601 GAGAGCAGCA TTGGTTCCGG CTCCACCAGA TTGCATGTTC GAATTCGCTG 
192651 CGTTAACAAT CATGACTCTT CCTGAGGTTT TCAGGCGTGG TTTAGCGATG 
192701 TCTCCTGTAG TGGCTACTAA GGTAATTGGG GCTCCTTGAT GTTCCCAGAC
192751 ATAGTATCTT TGATTAGGAT CTTGGAGAGT CCAGGATATA TTGATTTCTG
192801 AGAGAGTATT GACTAGGGTC CATTCATTTC TCAGCAGCTT TTGGTAATCC
192851 AGAGGGACTG AGGGTTGCAC TTGAGCTGCA ATATCCTTGG GGACATGTTT
192901 TTGGGGCCCA AGGAGCTCTT CTGTCGGTTT GGATGGCTCT CTTCTTCTTA 
192951 AAAGAAGACA AGAGAGTACG ACTGCTGCAA GGAGAGCAGC ACCAGTGGCT
193001 ATAGCCATGA GAGGCATACC AAGAACTCCA GCAACAACTG CAGTCCCTAC

193051 AGCTATGGCT AAAGCAAGAA TAAGAGCTGT GAGTTTTATT ATTTTAGCAA 

193101 TCACTTCTTT CTTAGTAAGC ACGGGTGATT CTGAGATCAA TCTCTCATCT 

193151 ACAGACAAAT GTGATTGATT CTCGTGTACA TCTGGGATTG AAGTGGTTTT 

193201 AGTCATAAAA TATCTATTTA AAGGAAAAGA TTATATCTTA ATTTGTTGTT 

193251 TTTTACAAAA AATCTTTTTC TTTGATAATA AAGATTTTTA TAAATTCTTT 

193301 TTAATTTCAT TAATTTTAAT GAGTTAAGGC GTAGCAGGCG CACCTTATTG

193351 TTTAGTTCAT AAGAGAACTC TAAGTGGGTG CCACACAGAG TTTTTTCTCC

193401 ACTTGGTTCA AAAGGGTGGC GTTTGTGGTT TTATGGGGAA CGATTCTACC 

193451 CCAGCCATGG GATTCAAAGA CTTCTTAGAT CGACAATACT ACTTGAGATT 

193501 CTAATGGGGG GTGAAGACAT GTGTATTTCA TGGGCATGGG TAACGGAGGA 

193551 CATCAGGTTT ACTTTCGAAG GCCCATATAT CATCTTCAGA GGTCAATAAA

193601 AGGGGGGGGT AGGGAGAACG ATGTTTTACA ACAGTGTAGA AATTAACTAT

193651 TTGTGAAATA TTTTCATTTA ATAACAATTT AATGAGTTTT AATTTGAAGA

193701 CACAGGCAGT GGCTTATCTT TTGTCTATAA GTCGGTACAG TCAGACTTAT 

193751 AAGAATTGCC GCTTTTTGGG TCTTCATAAT CTCTATCTTA AACAAAAAGG

193801 AGGTCAATCA CAAACTGGGT GAGTTGTAGA GGTCTCAATA AGAAAGTTTG

193851 TAAGGTTAGG AGAGCTATAA TTGCGCCTGG GATTCTGCAT AGACGATACC
193901 TGAAAAGATA GTGCCTAGGG TCGGTGTTTG CTGATGAGAA TGCCAAGAAG
193951 CATAGTCATC AACCGTCTTA ACATATTTAT AAGTTTTCTT AGCAGCGAAT 

194001 GGCACAGCTG CTAGAAAGCA AAACAAGATG ATAACAGCCA CCAATAGAAT
194051 TGTTCCTGAG ACAGCAAGGA CTGCAATGGC AGCTGGATGT CCAGTTTCTA
194101 GAGTGAATTG CAGAGCAGGG CGGATTGCCA ATAAACTAAG AACTATGAGG 
194151 CAGCATCCGA TCACTAGGAG AACTTCCAAT ATGACAGAAG TGATTGTCCG 
194201 TGGAGAAGAC TCAGCTTGCT GCTTCCCXTT GAGTTCTATG AGTTTCTGAT

194251 TTTGTGTGTT TGCGCTATTG ATTAAGGGAG CGGCTGGACT TTCAAAAAGA

194301 ACGTCCGTAG CAGATACTGG AAGATATCCC ATTTATTTTA ATATCCTTTA
194351 ATTAAAATTT GCGCAGATAT TATACCTTAA TTTAATTTTT ATTAGAGAAT
194401 AAAATTCTAT ATTTTCTTTT GTGTAATATT TTTTTATAAA AACATGAGTT
194451 TTTAATTTAA CTTAGAAGAG AAGTGTTGCA ATATCTTTTA TTAAAACAAG
194501 AGTACCTAAG AATTCGTGCG TAAAATATTA CGTATTTGAT TTGTATAAGT
194551 CGTGAGAGTG TTTTTTCTAT TATTGAGGGG CTATTTTGAT TTTTCTTGAA
194601 TTGCGTCATG GAAAGATCTT GGAGGCTTTA GGAGCCTTTT TGTCCCTCAT
194651 CATCTCGAGA TAACCAAACT GAAAAGATGC GTACCAAGAG TTTGTTTTTT
194701 CGTTCGCGAA TCTACTGTAC CCTTGATTTT AGGATAGCCG TTTTGAGTGA
194751 GATAGAGATC CTAAGAGAGA TCCTCAATAC TTTCTTGAGG TAGTTGGGCT
194801 AAGGATCTTT CAACACAAGC AATAGTTAGC GGATATTCTG GGTGCAGCTC
194851 GGCCATGTAT TTTACTGCTG TAATCATAGC CAAAGATTGT TGCATTCTTG
194901 ATTGCGGGGG CAATTCTCTA GAATAGAGAT GCTCCCTCAA GCTGTATAAA
194951 GGGATTTGGA TAATTTGGGT GTTCGTATTG ATCGCAGTCT GGATATAAGT
195001 GACATAGAGT GCATAGTAGT ATTGGTAGAT GGCTTCAGGA GTTGTTGGTA
195051 TTTGTGGGAG GAAGACGCGG CCTCGGAAAT CGGGTTGCCA GTATAAACGT
195101 TCTCCTGAAT CTCCCGCAGG GAGTTCTGTA TTTATTTCTA CCCTGGTTTC
195151 CTGAATTGCC TCATTGAGCA AGGATTTTGG GGAGGGAGTG GTATCTACGT
195201 AGGAGGATAT AGGCGCATTT TGCGAGGTTT CTTCAATTAA CAGAGTCTTT
195251 CCTAAATTTT CGGAAGCTGA AATTAAGAGA TGTTCTGGAT TGCCTCCCGG
195301 TAGGAGGGAT ACTAGGGAAA ACAGTGAATG TTTAGCTTTC CATATCTCGT
195351 AAGGGTTATC TGTGAGTAAT GCAGAGGGTG CGGGGAGATT CTGTAGGTCC
195401 TTGGGTTGGT GTTCCCAGTT TTCTTTAGAA CGATTGATTG CGATAGAGAT
195451 AGAAGAATAC TGTAGGTTGG TGGAGACCCT TTTTGGTTTT ATTTCTTTGT
195501 GTTTGCCACA AGCACGCGGT TTTAAGAGGG TGGTTGCTCG TTTATAAAGG
195551 ATAAAGTTGC TAAGAACAAC AGCAATAAAT GAAATTCCTG TGAGGAAATA
195601 GATAACAGGA ATATTTAAAA GCAAACCAAT AAGTAGGGTC CCTATAGTAA
195651 CTAAGACAGC AAAGATTGCA GTAATCAATA GAATTCGAGA CTGAATCTTA
195701 TCAAAGTTGC TGCAGGATTT TTTTTTAGTG TCTACTAGGG GATGATCTAC

195751 AGTTAAAGAT ATGGGTGAGA TTGTAGCCAT AATTAAGGAA TTCACCTGAG 

195801 TTCCTTTAAA TTATATCTCT TTTATTGATA TTTAAATAAT TAAAAACATT 

195851 TCTGCATTTA AAAATATTTT TTAATTTGTC ATCTTTCTAA GTAAATATAA 

195901 AAAATAATAA GTTTATTTAG TTAAATTTTT TTTGAAAACA AGAAAGATAG 

195951 GGGAAAGGCT CGCGAGTAAA TCAAGCCTTT CCTGAATGCA AAGAGTCGCA 

196001 TAACTACTTA GAAAAGAGTT CGTCAGAATA GAAGTTGTAT TAATATAAAA

196051 TTTTTATTTT CTAAGATTAG AAAGTAACTT TGACACAGCC ACCTTTGATG 

196101 CGGCACTGAC AAGCAAGACG TTCGTTAGAG TCTTCGGGTT CTCCTAGAAA 

196151 ATCGTATTCT GGTTCCGTAA ACTCAGAAAG ATTCTCACGT CCTTCTAAGA 

196201 CCTCTATCAC ACAAGTTCCA CAGACACCTT CTGTACAAGC AAAGGGAATG 

196251 CCCATGGATT CACAAGGCTC TGCGATCTCA CTATTGTCTT CTAACTCGAA 

196301 CTCTTGTTGT TCATCATCAG AGGTAATGAC TAGCTTGGCC ATGGAGTTTT

196351 CCTTCTACGT ATTGTCGGAT AGATTAAAAT GAAGTTCGGA GTAGAGGGAT 

196401 TCGAACCCCC GACCTATTGC TCCCAAAGCA ACCGCGCTAA CCAAGCTGCG 

196451 CTATACTCCG TAAAATAAAA TTTTCGAGTA AAAGAGATCA ACGTTAGCAT

196501 AGAAGGAAAT AGTCGACAAG ATCAAAGATG CTTCATAACG TCTCTTTAGA 

196551 GTTTTACTTG CAGATATATG GAAGAGGGAA GTATGAGAAA AGGCAGGGAT 

196601 TCTCTAGGAG GCTGTTTTTG TGTCTGGGAA GAAAGATGGT GTAAGGGGAA 

196651 TGATCTTTGT CCCTCTTAGC ATCCTAGTAC TAATCTTTTT ACCTCTTCCT 

196701 CAGATCCTTC TTGATTTTGG ATTGTGTATT AGTTTTGCAT TGTCTTTACT 

196751 AACGGTCTGT TGGGTCTTTA CCTTAAATTC AAGCAATTCA GCGAAGCTTT

196801 TTCCTCCATT TTTCTTATAT CTTTGCCTAT TGCGGTTGGG ATTGAATCTT 

196851 GCATCAACAC GATGGATTGT CTCTTCAGGA ACCGCCTCTT CTCTGATTGT

196901 TTCTTTAGGC AGTTTCTTCT CTTTAGGAAG TCTATGGGCA GCAACGTTTG

196951 CGTGCCTCCT TCTTTTCTTT GTGAACTTTT TGATGGTTTC AAAGGGTTCG 

197001 GAAAGAATCG CAGAGGTCCG TTCGCGGTTT TTCTTAGAGG CTCTTCCAGC
197051 AAAACAGATG GCTTTAGATT CTGATCTTGT TTCTGGAAGA GCTTCTTATA

197101 AGGCTGTCAA AAAACAAAAA AATGCCCTTA TAGAAGAAGG GGATTTCTTC 

197151 TCTGCCATGG AGGGGGTCTT TCGTTTTGTT AAAGGGGATG CAATTATTAG 

197201 TTGTATCCTT TTACTCGTGA ACGTAGTTTC TGTAACTTGT CTTTATTATA

197251 CTTCGGGTTA TGCTCTTGAG CAGATGTGGT TTACAGTTTT AGGAGATGCT 

197301 TTAGTGAGTC AAGTACCTGC TTTACTTACT TCGTGTGCTG CAGCCACTCT 

197351 TATTAGTAAA ATCGATAAGG AAGAGAGCCT TTTAAATTAC CTGTTCGAAT 

197401 ACTACAAACA GTTGCGTCAG CATTTCAGGG TGGTGTCGTT ATTGATCTTT 

197451 TCTTTGTGCT GCATTCCCAG TTCTCCAAAA TTCCCTATCG TTTTGCTCGC 

197501 GAGTCTTTTA TGGTTGGCGT ATCGAAAAGA AGAGCCTGCA TCAGAAGATT

197551 CTTGTATAGA ACGTGCGTTC TCTTATGTTG AGGGGGCCTG CCCTAAGGAA 

197601 CAAGAATCAC AGTTCTATCA AGTATATCGT GCAGCATCCG AAGAAGTATT 

197651 TGAAGATTTA GGAGTTAGAT TGCCTGTGCT TACTTCTCTA CGTATTGAAG 

197701 AGCGTCCTTG GCTCCGAGTA TTTGGCCAGA ATGTATACTT AGATGAAATG

197751 ACTCCAGAGG CTGTGCTTCC TTTCCTTAGA AACATCGCTC ATGAGGCTCT 

197801 CAATGCCGAG GTAGTTCAAA AGTACCTTGA GGAATCAGAG AGAGTGTTTG 

197851 GCATCGCTGT TGAAGACATC GTTCCTAAGA AAATCTCTTT AAGCTCTCTT 

197901 GTAGTTCTTT CTCGCCTCCT TGTTAGAGAA AGGGTATCGC TTAAGCTTTT 

197951 CCCAAAGATT CTAGAGGCCG TTGCGGTATA CCAAAATTCT GGAGACAGCT 

198001 TGGAGATCCT TGCGGAAAAA GTGCGAAAGT CTCTCGGATA TTGGATTGGG 

198051 AGAAGTCTCT GGGATCAGAA ACAAACCCTT GAGGTAATTA CCATAGATTT 

198101 TCATGTTGAA GAATTGATAA ACAGCTCATA CTCAAAGTCT AATCCTGTAA 

198151 TGCAAGAGAA TGTGATCCGT CGAGTAGACA GTCTTTTAGA ACGGTCGGTA 

198201 TTTAAAGATT TTCGAGCCAT AGTTACGAGC TGTGAAACAC GATTTGAGAT 

198251 GAAAAAAATG CTCGACCCAC ATTTCCCTGA TCTTTTGGTT TTATCTCATG 

198301 ATGAGCTTCC TAAAGAAATC CCTATTTCCT TCTTAGGGAT CGTTTCAGAT

198351 GAGGTTTTAG TTCCTTAATT TAATTTAATC TTCTGTAAGC ACTATTACTT
198401 TCGTATTTTT TTTGTTGGTG TAAATATTGT TTAAAAATTT TTTTTGAATT

198451 AATCTAATAA ATAACTAGAT AAAAAAAAAT TTGTGAAAAC ACAGCAAACT

198501 CAAAACATCA TAGAGGTTTG GAACTTCTAC TGGGAGACTC AGGAAATAGA 

198551 GTATCGCGAT AGCTTAATTG AGTTCTATTT GCCTTTAGTA AAAAGTGTGG 

198601 TTCATCGTTT GATTTCAGGG ATGCCTTCCC ATGTAAAGAC CGAGGATTTG 

198651 TATGCTTCGG GTGTTGAAGG TCTCGTCCGT GCGGTGGAAC GTTATAATCC 

198701 TGAGAGAAGT CGTCGTTTTG AAGGTTATGC GGTATTTCTG ATTAAGGCTG 

198751 CCATTATTGA TGATCTGCGT AAGCAAGACT GGGTTCCTCG TAGTGTCCAT 

198801 CAAAAAGCGA ATAAATTGTC AGGAGCTATG GATTCTCTTC GCCAGTCTTT 

198851 AGGCAAGGAA CCCACGGATC TTGAACTGTG TGAGTATCTC AATATTTCGC 

198901 AACAAGAGCT TTCGGGATGG TTTGTATCTG CCCGTCCTGC ATTAATCGTG 

198951 TCTCTGAATG AAGAGTGGCC TTCACAAAGT GATGAAGGAG CCGGAATGGC

199001 TCTTGAAGAG AGAATCCCCG ATGAACGTGC CGAGACAGGG TACGATGTTG

199051 TAGATAAACA AGAATTTTCT TTATGTTTAG CCAATGCGAT TCAGGAACTT

199101 GAGGAAAAGG AACGCAAGGT CATGGCCCTG TACTACTATG AAGAACTTGT 

199151 CCTTAAGGAA ATCGGTAAGG TCCTTGGGGT AAGTGAGTCT CGCGTCTCTC 

199201 AAATTCACTC TAAAGCATTG CTTAAGCTTC GCGCAGCACT CTCTGCATTT 

199251 CGATAAATAC AGTTCTCAGG TTTTAAGAGC AGTCCTAGAG CTAGGAGAAG 

199301 CCCTCCTACG ACACAGAGTA ATCCGTAAAG AATTTGTTTA GTCAACGCCA 

199351 TAACAATGCA GCCAGCAATT GTGAAGATTG CTCCTACAAT CAAGAGAAGA 

199401 ATGCGGGGCA GCCATTCTTT AATACGTTCG GCTAGGGTAG AGGTAGGAAG

199451 GGGTGCCTTG CTAGTTGTGG CATTTAGACT AGTTGCGGCG TTTAGCATCG

199501 ACGGAACCCC AATTGGAGAT GTAGACATAG CGACCTAATC TTTAGAGAAG 

199551 ATAGAAGTTA TGGTATCTCA GCAAAGGAAT TATTCTCAGC GTACAATCTT 

199601 TTCTTTATTG TTAGATAGGT GGTGCCTTGC ATATAAGCTT TAGGATCGAT 

199651 AAGATAGAGC TTGGCAATTC TTAATTATGT ACGATCACTC ATGCAATCCT 

199701 GGTTACAATC TTTACAAGAG CGAAATATTT TAGAGAATTT TACCGCAGGT
199751 TTGGAATCCG TAGAGGGACC TATCGCCGCT TATTTAGGAT TTGATCCTAC
199801 CGCACCTGCT CTACATATTG GTCATTGGAT TGGGATTTGT TTCTTGAAGA
199851 GACTCGCTGC TCTGGGGATT ACCCCCATAG CTTTAGTCGG GGGAGCCACA
199901 GGTATGGTTG GAGATCCCTC AGGGAAACAG AGCGAGAGAT CGTTACTTCA
199951 GACAAGTGAA GTTTTTGATA ACAGTCAAAA GATCACGGCG TGTCTCCAGC
200001 GCTATCTTCC CGGGGTGACT CTTGTAAATA ATGCAGACTG GTTGCAGGAG
200051 ATCTCCCTGA TTGATTTCTT AGGGGATATA GGAAAACACT TTCGTTTAGG
200101 CCAAATGCTA GTGAAAGATA CAATAAAGCA GCGGGTGCAT TCTGATGAAG
200151 GAATTAGCTA TACCGAGTTT AGCTATTTAA TCCTGCAATC CTATGATTTT
200201 TATCACTTAT TTAAAAATTA TGGCACGATC TTGCAGTGCG GTGGTAGCGA
200251 TCAGTGGGGG AATATTACTT CAGGAATCGA TTTTATTCGC CGTAAAGGGT
200301 TGGGTCAGGC CTACGGCCTT ACCTATCCTT TATTAACGAA TGCTCAGGGG
200351 AAAAAAATAG GGAAAACAGA GTCGGGAACT GTATGGCTCG ATTCAGATTT
200401 AACCTCTCCT TTTGAGCTGT ACCAATACTT ACTCCGTTTG CCCGATGATA
200451 CCATCCCTAA AATTGCTCGT ACGTTAACTT TATTGAGCAA TGAAGAAATT
200501 CAAGATATTG ATAGGCGTGT ACAGACGGAT CCAGTTGCAG TGAAGGAATT
200551 TGTAGCCCAA GATATCTTAA GTGCTATTCA TGGAGATCTA GGGCTTGAAG
200601 AGGCTCTTTC TGTAACTCGT AGCATGCATC CAGGGAATCT TTCATCCTTA
200651 TCGGAAAAAG ATTTTCATGA ATTGTTTGCA GGAGGGATGG GGGCCTCATT
200701 GGATAAATCC GAGGTGTTAG GGAAACGTTG GTTAGACCTA TTTCTTGTTT
200751 TGGGACTATG TAAATCTAAA GGGGAAATTC GAAGGCTAAT TGAACAAAAA
200801 GGGGTATATA TTAATAATGT GCCCATCGCT AATGAGCATA GTGTTTGTGA
200851 AGAACAAGAC ATCTGTTATG GTCACTATGT GTTGTTGGCT CAAGGTAAAA
200901 AACGAAAGCT TGTTCTATAT TTAAATTAGT TTAAGGAGGC TAGGTAGCTT
200951 TGCAAACGAA TATTGGTCTT ATTGGCTTAG CTGTCATGGG GAAAAATCTT
201001 GTCTTAAACA TGATAGATCA TGGTTTTTCT GTCTCTGTCT ATAATCGGAC
201051 CCCAGAGAAA ACGCGGGACT TCTTGAAAGA ATACCCTAAC CACCGAGAGC
201101 TTGTAGGGTT TGAATCTTTA GAAGATTTTG TGAATTCATT GGAGAGACCA

201151 CGAAAGATCA TGTTGATGAT TCAAGCAGGG AAACCTGTGG ATCAGAGCAT

201201 TCATGCGTTA CTGCCTTTTC TAGAACCCGG CGATGTGATT ATCGATGGGG

201251 GGAATAGCTA TTTTAAAGAT TCCGAACGAC GATGTAAAGA GTTGCAAGAA

201301 AAGGGGATTC TCTTCTTAGG CGTGGGGATT TCTGGAGGAG AAGAAGGTGC

201351 ACGTCACGGC CCATCAATTA TGCCTGGAGG AAATCCTGAG GCGTGGCCAT

201401 TAGTGGCTCC TATTTTTCAA TCAATAGCAG CAAAAGTACA GGGCCGTCCC

201451 TGCTGTTCTT GGGTAGGAAC TGGCGGTGCA GGCCACTATG TAAAGGCTGT

201501 TCACAATGGT ATAGAATACG GCGATATCCA GTTGATATGC GAAGCTTACG

201551 GTATCTTAAG AGATTTCCTA AAGCTCTCCG CAACTGCCGT TGCTACAATT

201601 TTGAAAGAGT GGAATACTCT AGAGTTGGAA AGCTATCTAA TTCGTATTGC

201651 TTCTGAAGTC CTAGCATTGA AAGATCCGGA AGGAATCCCT GTTATTGATA

201701 CGATTTTAGA TGTCGTGGGC CAAAAAGGTA CAGGAAAGTG GACCGCAATC

201751 GATGCTTTAA ATTCTGGAGT TCCCCTTTCC TTAATCATAG GAGCTGTTCT

201801 TGCTCGTTTC CTTTCTTCTT GGAAAGAGAT ACGCGAGCAA GCTGCCCGTA

201851 ATTATCCAGG AACCCCCTTA ATATTTGAAA TGCCCCATGA TCCCTCGGTA

201901 TTCATACAAG ATGTCTTTCA TGCTTTATAC GCTTCCAAGA TCATCAGCTA

201951 TGCTCAGGGA TTCATGCTTT TAGGAGAAGC TTCAAAAGAA TATAATTGGG

202001 GATTAGACCT AGGAGAAATT GCTTTGATGT GGCGCGGGGG ATGCATTATT

202051 CAAAGTGCAT TTTTAGATGT TATACATAAA GGATTTGCTG CCAACCCAGA

202101 GAATACCTCG CTCATCTTCC AAGAATATTT CCGTGGAGCA TTACGCCATG

202151 CGGAGATGGG ATGGCGTAGA ACAGTAGTGA CTGCAATTGG TGCAGGGCTA

202201 CCTATTCCCT GTTTAGCAGC AGCAATCAGG TTTTATGATG GCTATCGTAC

202251 AGCAAGCTCT TCAATGTCGT TAGCTCAAGG ACTGCGAGAT TATTTTGGAG

202301 CTCATACCTA CGAGCGTAAC GATCGCCCTC GAGGAGAGTT CTATCATACC

202351 GATTGGGTGC ACACGAAAAC TACAGAAAGA GTGAAGTAAA AATAAAAAAT

202401 CTCGAGAAGA CCCTAGGTAG CTCGAGATTT AGTTCACCAC TACCGAATTT
202451 TCAATTGTAA GCAGTGTGCT GATTTTAAGC GTCAATGTTA ATCTAATTTT
202501 AGAACTTCAA TGAAAGCTGT ATTGGGAATG GAAACTTTTC CAAATTCCTT
202551 CATACGTTTT TTTCCTTTCT TTTGCTTTTC CCACAGCTTG CGTTTCCTAG
202601 TAATATCTCC GCCATAACAC TTTGCGGTCA CGTTCTTAGA AAGCGCACGA
202651 ATCGTTTCTC TGGCAATGAC TTTTTTGTTA ATGGCAGCTT GGATGGGAAT
202701 CTTGAAGAGT TGTTGTGGAA TCACGTCCAC AAGCTTTTCG CAGATACTTC
202751 TTCCACGAGA TTCTGCTTTA TCTCTATGGA CTAAACAAGA AAAAGCATCT
202801 ATGGGCTCCT CGTTAATAAG AACCTCTAAT TTGATGATCG ATCCCTTACG
202851 GTAATCCCCA AGACGGTAGT CAAAGGATCC ATAACCTTTA GTTACTGACT
202901 TCAGCTTGTC ATTGAAATCC GAGACAATCT CATTTAAAGG GAGTTCGTAA
202951 GCAAGAACTA GACGGTGCTG ATCTAGCATT TCTGTTTTTA CGCAGATCCC
203001 ACGTTTATCT AAACAGAGGT TCATAATGTT GCTCAGATAT TCTTGAGGGG
203051 TGATAATATT CACATGAACC CAAGGCTCTT CCACATGCTC GATGATCGCA
203101 GGATCCGGAT ATCCTGAGGG GTTATCAATA TCTAGAACTT TCCCGTTTTT
203151 TAAGACGACT TTATAGATGA CACTTGGAGC CGTTGCAATA ATATCTAAGT
203201 CAAATTCTCG AATGATTCTT TCAAAGATAA TCTCAAGATG AAGAAGTCCT
203251 AAGAAGCCAC AACGAAAACC AAAGCCTAAA GAGTGACTGC TTTCTTGTTC
203301 TATAGTTAAA GCAGAATCAT TGAGCTGTAG TCTTCCTAAA GCATCTTTCA
203351 AAGTATCAAA ATCAGAAGAA TCTATAGGAT AAATTCCAGC AAAAACTACC
203401 GGATTGATCT CTTTGAAGCC TTCCAAAGGA GTTTTTGCAG GATGTTTTGT
203451 TTTCGTGACT GTATCGCCGA TCTTCACATC CTTCACTTTT TTGAGATTGG
203501 CAATAAAAAA ACCCACCTGA CCAGGGCGTA AGGAACCTTC TATAAATGTT
203551 GCTTTAGGGA GAAAGGCCCC TATACCTAAG ACTTCAAACG AGGAGCCTTT
203601 AGCCGCCATA AAAGTAATGC GGTCTCCTTT TTTTAATTCC CCGCTAATAA
203651 TGCGTACGTA GACCATAATG CCAACGTAAG GGTCATAATG AGAATCAAAG
203701 ACTAAAGCTT TAAGCTCTGT TTCTGCAGGT GCTTTTGGAG GAGGAACAAG
203751 ATCGATAATT GCTTTCAGGA TTGCAGGGAT CCCCTGACCT GTTTTTGCAG



203801 AACAGGCAAT AATGTTCGTA GTGTCTAGGC CTATATAATC TTCAATCTGT
203851 TGAGCAATTC TCACGGGATC AGCGGCAGGT AGATCAATCT TGTTTAATAC
203901 AGGAATGATC TCTAAATCTC TTTCAAGGGC CAGGTAGACA TTAGCAAGAC
203951 TTTGTGCCTG CACCCCCTGG GCGGCATCTA CAATAAGTAA GGCGCCCTCA
204001 CATGCAGATA GAGATCGAGA GACTTCATAC GAAAAGTCCA CGTGACCAGG
204051 GGTATCAATC AGGTTCAGTT GATACACCTC TCCTTCATAT AGATACGTCA
204101 TGGTGACAGG ATGAGCTTTA ATTGTAATGC CACGCTCTCT TTCAAGATCC
204151 ATGGAATCTA AGAGCTGCTC ACGCATCTCC CGTTCTTCTA CTGTGCTCGT
204201 ACTTTCTAAA AGGCGATCAG CAATTGTAGA CTTCCCGTGA TCAATATGCG
204251 CTATGATTGA AAAATTGCGA ATGTTCTCTA TCTTATATTC TTTCAAAATG
204301 TACTGTAGTG TTATCTAGGT TTATTCCTGG TTTCATAAAG CTGCATTGAG
204351 AAGGCTCCTT CAGACAAAGA CCAGCTTCTC AAATACTTTT GCGGACTGCA
204401 TCTCTTTATT AGCAAAATGA TCATGCGGTC ATCTTTAGCT TTGGATAGAT
204451 GAAAGTATAA TAGATGAGAT GTAGAAACCA CAAGGGTTTA AAGTCGAATC
204501 TGATTTGAGG TAGAGTTCCA GAGTGGCTAG CAGTAGAAAA AACTAAGTGA
204551 GAAGAAGTGC TCTCCGTTAG TATGAAACTG ATTCCCACTC AGGATTCTAT
204601 AGAAAGGGAA ACCGATTCTA AAAGAGATAA AAAAATATTT ACCATTTACA
204651 TATGTTCATC TAAAGTCCTT GCGGGTCATT TTTTCAGTCA TTTAGACAAG
204701 CATAATAAAA TTCATGAAAG CATTGGGGTT TGAGATAGTT AAATCGACTC
204751 GATCCAAGAG TATAAGAGAG GAATGAGTGT TCTTATGTCC GAGATAGTGC
204801 TCTTCAATGT AGTGAAGAAC TGGAAAGTCA GGAGATCTAT TTGAGTAGGA
204851 GTTCAGGTCT TCGCAATCGA TTTTTTCAAG CTCTGGAGTC TATGTGAAAG
204901 AAATTTATAG AAGTAAAATA ATTCCATGAA TTCTAAAATG ATTAGGAAAA
204951 TAATATAACA CGCTGATACT CAAGTCACAA TGGAGCTTCT ATGGTTAATA
205001 TACAGCCTGT GTATAGGAAT ACCCAAGTCA ACTATAGTCA GGCTACCCAA
205051 TTTTCGGTGT GCCAGCCAGC GCTTAGCCTG ATTATCGTTT CTGTTGTTGC
205101 TGCTGTACTC GCTATTGTAG CTTTGGTATG CAGTCAATCT CTTTTATCCA






205151 
205201 
205251 
205301 
205351 
205401 
205451 
205501 
205551 
205601 
205651 
205701 
205751 
205801 
205851 
205901 
205951 
206001 
206051 
206101 
206151 
206201 
206251 
206301 
206351 
206401 
206451 



TAGAGTTAGG 
GCTATGTTTA 
CCCTAAGAAA 
TTGATTTTAT 
ATCTCTATTC 
ACAAGAAAAA 
CAAGTAAGCT 
CATTGGTTGG 
AACCTATGGA 
CCTCTCTTTT 
GGAGAGTTTG 
TTCGGAACTC 
ATAAAGAAGA 
GATTCCCTCC 
GCAATTATTT 
TTGATTCTCA 
CATTCGTCCT 
GGTTTCTCTA 
AACCAACAAA 
CACTTTCAAA 
GTTCGAAAAT 
GAGAAAAAAA 
GTTCGATCTG 
GGGTAAGAGA 
GAATTAGAGA 
ATATCTAAGG 
TAAAAATCAT 



AACTGCTCTT 
TGATTTATAA 
ATCATGGAAC 
TAGAGATCAG 
TTAATAAGAC 
CTCTTACAGT 
CCCTAATTTT 
GACGTCTGGT 
TACTATTGGT 
TGAACGTCGA 
CTCTTTTAGA 
GTTCAAATCA 
GGTAGATGAA 
TTCACCTTAT 
CAATATATGA 
AGGCGAGGGA 
GGGCATTATA 
TTGACCTGGG 
ACACGGGGTT 
GGTTTAGACA 
ACCTTCTTGA 
TCCTAGGTCA 
CATTTTTTTC 
GAGCTATTTT 
TTTTTATTTA 
AGTTGGCTAT 
TGCGAGTGTC 



GTTCTAGTTT 
GATGAGACAA 
TCATCCAAGA 
GAGGTTTCCA 
GAATGTTTTC 
TTGGCATTGA 
GAAGAAATTC 
ATATCCCATG 
GTGGTCCTTT 
TCTCTTCTAT 
AGATGGTCTC 
GACAAAACCT 
GCAGAGTTAA 
TTTTTCTCAC 
ATGAGTCCGG 
TTCCAGTTAT 
CGAAGCAAAA 
AAGAACTGAT 
GCAAAAGATC 
GTACCTAGGT 
ATTATCCTAA 
GCTTGGCAGA 
GGGGATTGTA 
TCATGGGATT 
CACATAATCT 
GAGCATGACG 
ATTCTACCTT 



CTCTTATTCT 
GAACCTAAGG 
ACATTATCCA 
TTTATGAGAT 
GACAAAGCAC 
GAAGTTCAAA 
TTCTACAGCA 
GTATCGGATG 
AGGACTGTAC 
TGTTAAAGAA 
AAGAAAAACA 
TTTTACAAGA 
ACGCTGATTA 
AAGCTCTCTT 
ATGGGATTGG 
CACGTCTGGT 
GAGCAATTTT 
AGAAATGCAG 
TTTGTAATGT 
TCCTTAGATC 
ATACCATTTA 
AATTTTTGAA 
AAAAGTTTGC 
TTTTATTACA 
ATACTGCCTT 
ATCGTTCCAC 
TCCTTTGAGT 



TTTTGCTTCT 
AGTTGCTGAT 
AGTATTGTTG 
ACATCACTTG 
CAGTATATTT 
GATGTACATC 
TTGCCCATTG 
TCACTCCAGG 
GAGAACGCTC 
AATTAGCTTT 
CGTGGAGTTC 
TATTATGCTG 
CGAACAGTTT 
GAAAGCAAGT 
CTTTGTGATT 
TGGGCTGTTA 
ACCTTCCTGA 
TTATTAAGCA 
ATTTGAAAAA 
TAAATCAAAG 
GATAGGGAGT 
ATCCTTAAGT 
CTAATCTGAA 
GGAAATTCTT 
CAATAGATCT 
ATGCTTTATT 
TCAAGGACTA

206501
206551 
206601 
206651 
206701 
206751 
206801 
206851 
206901 
206951 
207001 
207051 
207101 
207151 
207201 
207251 
207301 
207351 
207401 
207451 
207501 
207551 
207601 
207651 
207701 
207751 
207801 



TTGTAAGAAT 
TTAGGCTGTT 
TTTTATTGCC 
GAGAGAAGAA 
CACGTGATAG 
ACAGCAGGTA 
GTAACATATC 
AGTAAAGTAG 
GAATAATTGG 
ATTGGCTTCA 
CTAGGTGTCC 
ATACAGTACA 
TTCAACAATT 
CAAGAGAAAT 
CACTACCTAT 
CAAAAGAGAC 
TCTTTTGATC 
TTGGCTGTGT 
CTCTTGTAGG 
GATTTTGATG 
TGTTCAAGCA 
TCTTGTTAAG 
AGACAGGGTC 
GGTTTATGAC 
GTATCTTCTT 
GGGCTACGAG 
ACCTATCAGC 



AGCCATTGCC 
TGGCTCCTCC 
TTTGTCATTC 
GCTTCCACCA 
ATGAAGCTTA 
ACATTAGCCG 
TCCTGAAGAG 
AGAGTTTTGG 
CCAATATTTG 
GAAATTTATA 
CTAGAGAATG 
GCTAAGGCTA 
AACGAAAGAG 
GGGATACTGA 
ACGGCACGAG 
AATCAGTAAG 
AGCTACAGCT 
TTTGTAGATA 
AGCTTTGTCA 
TAAACCTAGG 
TTTTCTGCTT 
GCATTTGAGT 
TTCACAGAAT 
GTCAATTTTG 
TAAAGACTAA 
AGGGAGTAGA 
AGGGAGGACT 



AGCCTCTTTT 
CGTTTCTTAT 
TTTCTTTAGT 
ACACCAAGAA 
TGGCCTTTCA 
AGTTTAGACA 
AAAATCAAAC 
TATTAGCAGG 
AAGATCTTTT 
TCAGCAGGAG 
TTATGGGTAC 
CAATTTTTTG 
GACGTTCTTT 
TGAAGTCAAA 
GAACTCTAAA 
GAATTGCTAT 
GATCACTCAA 
ACAGTACCGC 
TCCCAAAATC 
CCTGTATGTG 
CTGATGAGCC 
TCAGTTTCTA 
AGCTCTAGAG 
TAACAGGAGC 
ACCAGGTAAC 
ATTTTTCTTA 
CGTCTTCTTC 



GTATAGGTGC 
ATTGTTGGGA 
AATTTTAGCT 
TCATTCCTGA 
ATCTCTGCAT 
ATTTTCTACT 
AATTGCCTTC 
CTCGCAGGTG 
AAGCCAAACC 
ATCCACAAGT 
TATTGGCTAG 
TAAAGAGACG 
TATTAAAAAA 
GCAATTGTAG 
GACCGAAGCA 
TGTTGAGCTT 
CTTCCTAGAG 
ATACAACCTT 
TTCTTGACGA 
ATTCAGGATC 
AAAGAAAGAA 
AGCGATTAGA 
CATGGAAATG 
TAGAATTCAT 
TAGCTTTTTA 
TTCCAAAATA 
AGGTTGGGAG 



ATTAGCAGCT 
GTGTTTTAGC 
TTGATTTTTG 
TAGATTTACT 
TTGTAAGAGA 
GCCCTGTTGT 
TGAATTGCGA 
ATTTAGAAAA 
TGCCCGTTAT 
TTGTAGAGAC 
GGCCTTTGGG 
CATCATATTC 
CAAGGCTCTT 
AGCGTATCTA 
GGGGGACTTA 
GCATGGCTAT 
ATGCTTGGGA 
CAGCTTTGTG 
ATCTTCTATC 
TAAAAGAAGC 
CTAGGTAAAT 
GAGTGTATTA 
CCAGAGCTAG 
AGGAAGACGA 
GTCTCGAAAG 
GAAATCTCTA 
AAGACGCTTC

207851 TCTTGAGTGG ATGGACTGTT

207901 TGAGTTGCTA AATTCTCAAG

207951 GATCGTTTTA AACTGATCTA

208001 CTTCGCTTTC TTTATGAATT

208051 GAGGCAAATT CCATGTCTTG

208101 ACAGAGGTTT TCCCATAAAT

208151 GCTTCCACGA GTACCTATGC

208201 TCATCCTCTT GAAGAGAGGG

208251 CGCATGATAT TTAGCAAAGA

208301 GATTCAGGTT CCACTCGCCA

208351 ACATCACGAG TTAAGACTCT

208401 AGAGGAATTT GTTGGTCTAA

208451 GCATCTCTGG TTTTATATAA

208501 ATTTTCTTAA CTCGATGGAG

208551 CTCTCCTAAC CAGAAAAATG

208601 AGGGGAAGAG GGGACGGGAA

208651 GCAGTACGGA CTTCATTTTT

208701 TTTCTGTTGT AGTTCTTCAG

208751 CAGAGTGCAA TTGATTTAGA

208801 TTTGGTTTAG CTTCTGAAAC

208851 CCTGATTCTA TTTACGAGTT

208901 TCATAGGCGT GATTTTTTTT

208951 AATAATCCTA ATCCTAACAA

209001 TGCTACACAC AAGAAAGCTA

209051 TAACAATATG AATAGTGGTT

209101 GAGCGATTAT TTTGAATAAC

209151 AGGCTCATCA ATTTGCGTAT



ATAGGTGGTA AGCTTCTAAA GGTGGAATTA 
GAACATCGCT ATATTGGTTG ACTGACGACG 
AACTAAGGAG TGTCAATGAA GAAAGAAATG 
AGGCCTTGTG TGTAGAGAGT ACCAATTAGA 
GGGTCGTCCT GTATGGTCTA AAGTCAAGCA 
CAGCCGGGAC TGTTTTTATT AAGGACATCT 
TTAAAAATAC ACAAGAGCAG GTTATAGAAT 
TTGAGATAGA TGTTTATAGG ATTGATACGC 
GCTGCTGTTT CATAGTATTT AGATCACTAT 
TTTAATGCTG CATACTTAAG ATGTTGAAAA 
AGCTAAAACT AATGTGTGTA GATTTAGAAT 
AATCTAAAGG AATCAACCAA TAGGTAGGAA 
TCACTAAGGT CTTCTTCGAG GCTGCCTCCA 
CTCTACAACC TTGCTGCCTG CAGAAATAAA 
GAAATACCTT TTGTAAGATT TTTGGTAAAG 
GAAGCAGCGC TTTCAAGTCT TTTAAGAGAA 
TAAGCGTGCG ATACCCTCGA ACGTATCTAT 
ATACGTTGTA ATTTGTAGAT GATCCAACTT 
AGATCAATAA AACTTATGAG ATCTTTAAGA 
AAAATCAGAG ACAAATTTAG GATAGTGCGC 
CTTGGGGGAA TACTTGTTCT TTTGATGAAA 
ATTCCTAAAA TCACACCAAT CAAGGCTATT 
TGCGCCACTT AGAATATAGG AAACAGGAGC 
TCAAAGCTCC GCAGAGTAAG ATGGCACTGA 
GAATTCTTTA ATTCAAAATA ATAATTACAA 
TGGCGAGGTT ATATTACTCA TACATTTCCT 
TATATCGTAA ATTTAACAAA AATCCTATTA

209201
209251 
209301 
209351 
209401 
209451 
209501 
209551 
209601 
209651 
209701 
209751 
209801 
209851 
209901 
209951 
210001 
210051 
210101 
210151 
210201 
210251 
210301 
210351 
210401 
210451 
210501 



ATAGAAATGT 
TGATGAAAAG 
AGCTTGTTAT 
CCCGATTGGG 
ATTTCTAAAT 
GTATTGCAAA 
TGCTTTCTAT 
TGAGGTAAGC 
TCGTTTTAAC 
CAAGAGCACT 
ACCTTCAATT 
GGATACTGGA 
TTGCCATGGA 
CAATTATCAG 
ACCATGAGAA 
TCCGGAAAAT 
TTTTTCACAG 
GAACTTATAC 
AGTGGAAAAT 
CCATAAACTT 
AGTTTTGTCT 
GAATCTCTTC 
TCGATCCCGA 
ACCGCTTTTC 
AGTGTGTTTT 
ATCTTCTGAA 
TGCTACGACT 



TTTTATTTTT 
TTTTCATTGA 
GCTTTTTTTA 
AATAACGTAT 
CTCAATTATA 
TGATCAGACT 
TTTTTTCTCT 
TTGATGTCCA 
AGATTTTGAT 
CGCTGGGTGC 
CTTTCCACGT 
TCGAACATAT 
ACAGTGGCCT 
GATTTATAAG 
AAGAAAAGAA 
CATATTGTGG 
AAGGAGAATC 
TGTGATTCTG 
AGTTGTATAT 
TTTGTGTTTT 
ATAAAATGGG 
GAACTCTGGA 
AAGCCTCTGC 
CAGCAATTAA 
CACGAAATAA 
GTTCGTTGGG 
AGCGTTATAA 



AAAATTTTTA 
AGATAGCTTC 
AAAAAAGGGC 
TCTTTTTATA 
AAAAGTCTTT 
CCCCTTATAG 
AGTTTGTAGG 
TTGTGAACTC 
GGTGCGTTGT 
ATAGGACTTT 
CATGAAGTTC 
TTGTTTCAGT 
CCTGCTTTAT 
TTGAATCATC 
ATAAGAATTG 
GGGAGTTCCT 
CCATTGATTC 
AGATTAATGT 
CCTTTATGAA 
ATTTAATCCG 
AGAGCCAGTA 
AACAGGGTTA 
TTTTTTATGT 
TAAATATTTT 
AAGACTTCTT 
GATGGGCAGA 
TCAAGATTAC 



TTGAAATTGT 
TAGAAGAAGA 
GAGCCCCTTT 
AAAATTAAAG 
CAAACGCAAT 
GGAAGATCGT 
TTTGGGCGTG 
GTATTACGAA 
ATTTACGCAT 
CTTTTACTTT 
ACTGTAGGTT 
ATTCAAAAAG 
CAAACTGACA 
TGAGCCTGTT 
TGAGATTCCT 
TGAATATTTC 
TTAGACGCTT 
TAGTAGAGGG 
ACGCTAAGGG 
ATTTCCCCAG 
TAACGGGCAG 
AATCTATAGA 
AAATTCGGAG 
TAATTCGTTA 
TAGGATAGCG 
ACCTTTGATT 
GATAGCGGCT 



TTTGTATTAT 
ATAACGAGAG 
TTGTTCTTTT 
ATTTTTTTAT 
TTGCTTGTAG 
ATAGGGAGAC 
GAAAGCATAC 
CAAATTGCCA 
ATCTTCTGAA 
CTCTAGTAAA 
CATAGTTAGA 
CCTCCAAATG 
CAACATTTTC 
CCCAAGTGAT 
TGAACATCCT 
TTCGCAGGTT 
TACTATATAG
CGTGTATAAG 
CCCAAGTAAA 
CAACAGATTC 
TTTTGAAGAA 
TTTTAGAATA 
GCAGGTCTGT 
ACAGTCAGGG 
ATTGTAAATA 
TAGCTAGCAG 
AAAGCTAAAG

210551 TTCCTCCAAT AGCATAGCTA ATCGGCGCAG CAAGGAAAAC AAAAGAAAGT

210601 GCGCTAACAA GAGCTAGAAC AAGCCCAAGA ATGAGTCGGG CAATTGTCCT 

210651 AATTTTTAAA GAACAAGATC TATGACATTC GCAGTCATTT TTTAAAAATA 

210701 TATGGGGAAC TAGTGTTAAA GGACTACACT TCATAGTACA ACCAGAATGT 

210751 CTCACGGTTA TTATAACCTT TTGATTGAAA AAAGCGTACG AATTCGTGAA 

210801 ATGGGAGTTA ATAAAAAAAA TCCCTACCCA CTAAAAAGCA GGTAGGGATC 

210851 AACAAGAGTA AGAGAAGCAA CTCTATGAAG AAGCAGGAGC TGAATCTTCT 

210901 TGAGCCACTT CTTGTTCTTT AAGAGCAGAC TGCGCTAAGA ATAGTTTGTT 

210951 TAACTTAGTT GCAGAAACCA ACCAAATAGC AATGATGAAA AGAAGAATCA 

211001 CTGCAAGATA AGGGGTCATA GCTCCAATAC TTCCACAGAT AACGAGCAAA

211051 CCTTGTTGGA TTAAAGCTCC TCCTGATTTT CCGAAGCGGG CAGCAACTAC 

211101 ATCAATAGCA GCCTTACCTT TGACTTTTTG CTCTTGGTCA AGAGGGATAT 

211151 AGGCCATTTC TTTAGTTGAG TCAAAGAGAG CGTATTTTGT GGATTTCGAA 

211201 AGAATATTCT GTATAGCTCC GACAACCACA GCTAGCATGA GAGGAGTTGT 

211251 ACCGAACATA GCGACCAGCC CAGAAGCTTG GTTTCTAAAG ATAACAAGAG 

211301 CGAAGAAAAC GATACCTGTT AGGAGAACCA TGACAGGAGT GACTAGGGCT

211351 CCAGTTAACC ATCCAAATTT ACGAATGACG TTACCACCAA CAAATAGCAT

211401 GATAAGTACG GATACTACGC CAGTCCAGAA GGAGAAGTTC CCCATGAACT
211451 CACTATAGTC ATTCATATTA GGATATTGCA GTTTCAGCTG ACTTTTCCAA 
211501 GTCACTTCGA TTAAGTTAAT GCAAATACCA TAGGCAATAA CCAAGAGAGC 
211551 TAATAAAAGA ATATAAGGAG ATCTAGCAAG ATAGAGGAAG CTATCTTTCA 
211601 TATTCATTTT AGGTTTAGCA CCTTTTTTCC CCTTTTGCAT TTCTTCTGGA 
211651 TTATAGAAGC GAGGATCGGT CAATACGTTC TTATTGATCC ACCAGTAACT 
211701 GGCCATAAGA ACAAGTCCAG ATACAATAGT CATAGCCATC AAAAGACGTA

211751 AAGAAATTCC CCAAGGATCT ACACCTTCAG AAACGGAAGC TCTCAACTTT
211801 GAAGCCCAAA CAATTGCACG ACCAGAAGCT AGTAAAGAAA TATTAGCTCC 

211851 GATACCGAAA AGAGCGTAGA AACGCTTTGC TTCGTGGATT TTTGTAATTT






211901 
211951 
212001 
212051 
212101 
212151 
212201 
212251 
212301 
212351 
212401 
212451 
212501 
212551 
212601 
212651 
212701 
212751 
212801 
212851 
212901 
212951 
213001 
213051 
213101 
213151 
213201 



CATTAGCAAA 
TCAGCAAGTA 
GAGTCCTAGC 
CTGTAGGATG 
GCAAAGAAAA 
ACTTAAAATA 
AGGGGACAAC 
CCAGGAGCTC 
GTTAAATGTA 
GCTCGTGAGT 
TTTTCTTCGG 
TTCTGATAGT 
CGAAAGACCT 
GGATACTCCT 
CTGTAAAATT 
GGGAGAGCTT 
CGCTCACGAT 
GTTAACGGTT 
TTGTCTTGCC 
TAAAGTCACG 
ATTGTGCGTG 
AAGTTGTTCG 
TATAACCGCC 
TTTCCGCAGA 
TTTTATATGT 
GCTTACATAT 
AGAGAAATAA 



TCCCCAGAAC 
CATAAAATGC 
AATCCTGGAG 
TAAAACATCG 
TTAAAAAGGG 
TTACTTAGCT 
AAGCCAAAAC 
CCACAATAAG 
ATACAGAAGA 
ATGTATCGGC 
TTTTTGTCAT 
TTTTTATTTC 
TATAATATAC 
AAGGGATAAG 
TATTTTTTAA 
ACAGCCTTCA 
ACCGTTGCGA 
TTTTAGGGCT 
AGTTCTAAAA 
ACGTTTTTTA 
GGGTTTCCGA 
GGAGCAGCGC 
GCTAGGGCCT 
CTCTTTTAAT 
GATTGTTCTA 
TTTACAAGAA 
AATTAAAGCG 



ATTAGAGATA 
AGCAAATGTC 
GTAGGATGGC 
CGTAGCGGAT 
CGTTCCCACT 
TTGCATAAAT 
TTGATGAAAG 
AGTGTCTTTT 
ACATTAGGAA 
CACAAGAAAG 
ATTTACCCTC 
GTTGTGGACT 
CATCTTTTCT 
AGTTCTTAGG 
TTAAACGAAA 
CAGCTTCGAC 
ATCGAACGAA 
AAAGCGAGCG 
CGCGTTCAAT 
GAGCCTCTAG 
TTCTAAAATA 
TTTTCATCTT 
GGAAGATAGC 
AGCTTTTCCG 
CGGCCATGAA 
AGAATGTGGC 
CTGTTTTAAT

GCATGACGCT
CAGTTTCTTA 
CTGTAAACGG 
AAATTACAGT 
GCATAAAATA 
AAGCATAAAG 
GTATTGCCTC 
GTATCGCGTA 
CATTGGCAGA 
AGCGCAATTT 
TGAAATACTT 
AACTTGAATT 
CTCTATTGAC 
CGACTTTATT 
GGAGAGGTAA 
GTAGGCATTC 
TTAACTCCCT 
AGGAGGTCTT 
ATCTGTTTTA 
GAGCTCTAGG 
AATGTTTTTA 
TTTTAGAGTG 
GACATAAATC 
ATCAGTTTTT 
ACCCACCTTT 
ATACTTTCAA 
CAATCAGCGA 



TCCCCATAGT 
AGATGGCAAC 
TCAGCAAATT 
CGGGAACAGG 
AGGCCTGCTT 
ATAATAGCAC 
TGCACCAGAA 
ACACCGTATA 
ACTTTCTTTA 
TCCAAAAGGT 
TTATTTTCTA 
GTTTCCATTT 
AAGAGGAAAG 
GCTAAGTCGC 
CTTCAAGGTT 
CATAGCTCTA 
TTTTAAAGAA 
TGTCTCCAAC 
GTAAAGTTAA 
CTTAGGATTA 
ACATTTTTAA 
AAATGATGCA 
GTTTTCTTTG 
CTATTTCTTC 
TATAATTGAT 
TTTAATGTTT 
ACATAAATTA 



213251 TATTGCAAGT TCTCCTTTTT
213301 TTCTTAACAA AATTCAAAAC
213351 CTATAACATT TTGTTTTCTT
213401 AAACTATCCT GAACAAAGAT
213451 TGCTGGATAA AATATATTTT
213501 TGGTTTTTAA TGGATCTTAA
213551 TGGGATATAT ATTTAAAGTG
213601 GGTATAACTT TTGGATGTAC
213651 TCCTTGTATA CTATCCATGA
213701 TCGTGGGGAA TAGGCTTGCT
213751 CCTCATGCGT ATGAGATGGT
213801 TGCCGTAATT TTTTGTAACG
213851 GGAAGCATTT AGAAAATAAT
213901 ATAGCGCGTG GGGCCTTTGT
213951 TCATATCTGG ATGGATCTTT
214001 CAGAAGTTCT CATTGAAAAG
214051 AATAGTGAGG AACTTGTTTG
214101 ACAATGCTTG AGCACAATTC
214151 ATAATGCGTT CAGTTACTTT
214201 GTGGCTTCCG GAGCATGGAG
214251 TCCAGAAGCT CAAATCAGTG
214301 TTAATGAGCA TGATGTCAGT
214351 GATGCGTTGA AAAAAATTGT
214401 TCTAGCTCAA AAACCATTGT
214451 GCACCTTTAA ACATAATGTC
214501 GCTCTTGAAT GTCAAAGATG
214551 TAAACTATGA GCATGCAGCC



GCAATATTTT TCCTCTCAGG ATCTTTTTGT 
ATAAAAATCG TAAAGTAAAG AAGATTTTTT 
ATAAAACAAA CGGGTTTTCT AATTTTTTTA 
ATTTCTTTCA TTTCTATTAG TTGTATTTCT 
GAATAGAACT TGTTTTTCTG GTACTTTAAG 
AAAATGCTTC TAGAGAGATG GATGCGAAAA 
ATGCGTTGGA TTTTCTGTTT CGTGGCATGT 
CAATTCTGGG TTTCAGAATG CAAATTCACG 
ATCGCATGAT TCATGATTGT GTTGAAAGAG 
ACCGCTGTTT TGATCAAAGG ATCCTTAGAC 
TAAAGGGGAT AAGGACAAGA TTGCTGGAAG 
GCCTGGGTCT TGAGCATACA TTAAGTTTGC 
CCCAATAGTG TCAAGTTAGG GGAGCGGTTG 
TCCTCTAGAA GAAGACGGTA TTTGCGATCC 
CTATTTGGAA GGAAGCTGTC ATAGAAATTA 
TTCCCTGAAT GGTCTGCTGA ATTTAAAGCA 
TGAAATGTCT ATTTTAGATT CTTGGGCGAA 
CTGAAAATTT ACGGTATCTT GTCTCAGGTC 
ACACGTCGCT ATTTAGCTAC TCCTGAAGAA 
GTCTCGTTGT ATTTCTCCTG AGGGTCTATC 
TTCGTGATAT TATGGCGGTT GTAGATTATA 
GTGGTTTTCC CTGAGGATAC TCTGAACCAA 
TTCTTCTCTG AAGAAAAGTC ATTTAGTTCG 
ATAGTGATAA TGTGGACGAC AATTATTTTA 
TGCCTTATCA CAGAAGAATT AGGAGGGGTG 
AGACTTTTTG GTCTGTACAC AACCTTTGTG 
GTTCTTTATC ACATATCCTT TTCCTTGGGA 




214601 
214651 
214701 
214751 
214801 
214851 
214901 
214951 
215001 
215051 
215101 
215151 
215201 
215251 
215301 
215351 
215401 
215451 
215501 
215551 
215601 
215651 
215701 
215751 
215801 
215851 
215901 



AAGGGGTCAT 
TCTCTTAAAG 
ATTTTTTTAA 
CCTCAGAGAG 
AGCCCTTATG 
CGGATGATCG 
TCCGTAGCAG 
AGCATTTTTA 
ATGAGTTGTT 
GTTTTGCAAG 
TGACTTGAGT 
AGCGTTTGAT 
ATTTTCCAAA 
GCTCTCTCGA 
TTTTTCTGAT 
GTATGACCAC 
CTTTTAAGCG 
AGCTTTGATG 
TTGTGTTGTT 
TTCTTAGGGA 
TCTTGTGGTA 
AAAGTAGCCC 
GCAGCCACTT 
TGCTTCGTTA 
CTTTTGATAA 
GAAGCACTCA 
AAGCGTAGGG 



TAACTGCTAT 
GCTTCCTTAG 
TCAAAAATTT 
CTAGCGTGGA 
GGGTGTTACA 
AAGGGAGGCC 
ATAGACAAAT 
GCACGTGCTT 
TTCAGCGATT 
AGCTGCGAGA 
CATGTGCGTC 
TTGTTGTGGC 
CGTATGGTTG 
GGAAAACAAT 
ACGATTTTCT 
AGCTTTGTGG 
AAAGTTTATC 
GCGCAATATG 
TGGGTGTGCT 
AAGTATGTAA 
TTCTTTGCTA 
TACGCTATAC 
TAGGTTTTCT 
TTTGCTTTAT 
AGATTTTGCT 
GTCTAATTTT 
ATTGTTTTAA 



TTTAGGTCCT 
GCCTGATCAA 
AAGAAGGTGC 
TTGGGATTTT 
GCTATAAAGG 
TTTCATATTT 
AGGACAGCTC 
TGATGCAAAA 
GATATGGCTT 
TCAGGGAAAG 
AACTATTTGA 
CCTACTGATG 
TGAAATTGAA 
TTGGATCGTG 
TATCTAGTTT 
GGGACAATTC 
TCACGCGTCG 
TTTTCTCATT 
GCTTCGGTAT 
ATTACATAAA
TCGGAGTGAT 
AATCGCATTA 
TGAAGCTACG 
GGTGGTGGTA 
GTTACTTGTG 
TATATCGTTG 
TTTCTGCTAT

AATGGAGCTG
ACCCTCTTCG 
GTCAGCGCAT 
CCAATGACTG 
AATGTGGGGG 
TAGAAAGAGT 
TCAGGAGGAC 
AGCAGATCTA 
CGTTTAAAAC 
ACTATCGTCG 
TCATGTGGTT 
AATGTCTGAA 
CTTTTGGAAC 
CTGATTATGC 
TTTAGCTGTC 
TCTTGATTAG 
TATCCAGGAC 
GCAAGCTTCT 
TTGGTTATGG 
GACTCCGCCC 
TTTAGCCAGT 
ACGCCTATCT 
TTGGCTGCGA 
TCGACAAATT 
GCTTAAAGAC 
GTGATCGTAA 
GTTTGTGGCT 



GTAAAAGCAC 
GGGACTGTTT 
AGCCTATATG 
TCTTAGATTT 
AGAATTTCTT 
TGGTTTGGAA 
AGCAACAAAG 
TATCTTATGG 
ATCTGTAGGG 
TTGTTCATCA 
TTATTGAATA 
TGGAGACACT 
AAACCCTGAA 
TCAGTTGTGT 
ACTTTGATTT 
CAAGCAGCCT 
TTCTAGTTGG 
ATTTTTTGGA 
GATCATTGTT 
TTTGTTTTGT 
TATGTCAAGG 
ATATGGGCAA 
TCGTCTTTTG 
GTTGTGACTA 
TGTTCTTTAT 
GTGGAGTTCG 
CCTTCTTTAG 



215951 
216001 
216051 
216101 
216151 
216201 
216251 
216301 
216351 
216401 
216451 
216501 
216551 
216601 
216651 
216701 
216751 
216801 
216851 
216901 
216951 
217001 
217051 
217101 
217151 
217201 
217251 



GTGCTCGTCA 
TTCTTTGGAG 
CACATGTCGT 
CGGGACCTTT 
CTTTTTTCTC 
CTTTTCGTTT 
TTTCTCATAA 
TATAAGTATC 
TCAGATTTTA 
GACTCACAAA 
AGATTATGGG 
TGTTCATGAG 
ATCATACCTT 
CAAATTATCC 
CCTTATTATG 
ACTGTTTTCT 
AGATTATAGT 
TTTTTAGTCT 
TGTCCTTTTT 
CCCTCTCTTT 
ACAGGGTTTC 
GAGCAGCACC 
TGTTAGTCTT 
GGAAACGCAG 
GATTTTGGCT 
GTTCTTCTTT 
TTGGTTGATT 



GCTTTCCGAT 
GGATTAGCGG 
GCTATTATAG 
GGTTGTCATT 
CAAAATCTGG 
TCAAAGGATC 
TCGTTTAGAG 
AGGAGTATTT 
GAATGGCGGG 
AAAAGGAAGA 
AATCGTATCT 
TTGGCTGAGG 
GACAGAGATT 
CAAATAAAAA 
GAGTATCTTT 
GGAAGTCTTT 
ATTCCTTGCT 
TGCGAAAGAT 
GGTTTGGTCT 
GGGTACCTTG 
TTATTTACTT 
GCTCTAGTCT 
TATGACAAAG
ATTCTTTAAC 
AATGCTGTAA 
CGATTCTGTA 
ATTTGATTAT 



CGTCTAAGTA 
AGCTTTAGGA 
GGCAACAGGC 
TGTGCTGGAT 
GTGGGTCATT 
AAGAACACCT 
AACATTAGTG 
TGGGCCTAAG 
GTTATGTTAA 
AGTGAGGCCT 
TGTGAATTCT 
AAATAGAGCA 
CTCAATGATC 
AAAGGAAGTC 
TTTCCAATTT 
TCACGGGTTC 
ATTTCCTGTT 
GGCTATGTAT 
GTGTTTGTTT 
ACTCTTGCAG 
TATTCGTAAT 
TTTCTTTATT 
AATGCTCATA 
GAAAGAGGAT 
TTACTATTTT 
TTTGCCTCTT
TTTTCAACTT 



CAATTCTTAT 
AGCTATATCT 
GGTGCCTGTA 
TATTGGCCGG 
CGTTTTGTCC 
TTTAAAGGTG 
TTCGAGATTT 
CCTTTCCCTA 
AAAAGAACAA 
TAAGATTAGT 
TTAGATTTTA 
TGTTCTTACT 
CTTGTTATGA 
TAATGGCTTT 
TTTTCAGTAT 
TCTCTATATT 
CAGGTGCTTT 
GCGAATGCTG 
GTTTACGCAT 
CAATGGCAAC 
ACTTTTAAAG 
ATTCTCTCTG 
TAGGAACGGA 
ATTTTCCCTG 
TGCGTTCCGT 
CTTTAGGAAT 
TCTGCATGTC 



CCTTTCTGCA 
CTGTAGCATT 
ACCTTGCCTA 
TCTATGTTTG 
GTAGGAAGCA 
TTTTGGCATA 
TGTCTGTAGT 
GATGGAGAGT 
GATTATTATC 
TCGTGCTCAC 
GCAAGGAAAG 
GAAGAATTGG 
TCCTCATCGA 
GGGACCTTCT 
TTTTTTCGAG 
GATGATATTC 
TGCAGGAACT 
TCTCTCATAC 
CAACTGACGA 
AGCTATGCTG 
TTTCAGAAGA 
AGCCTTGTTT 
GCTTGTGTTA 
TCACTATTGT 
AGCTTAGTTT 
TCCTATTCGG 
TTGTAGGAGC

217301
217351 
217401 
217451 
217501 
217551 
217601 
217651 
217701 
217751 
217801 
217851 
217901 
217951 
218001 
218051 
218101 
218151 
218201 
218251 
218301 
218351 
218401 
218451 
218501 
218551 
218601 



TTTTAAGGCT 
CGCTTATTGC 
TCGTTAGTTT 
AGCAATTCTT 
TAGTGTTCTT 
CGAGGCTATT 
ATATTAGCAG 
GTTAGAAATA 
TGATTCTCTT 
GTTCTTGGGT 
GCGCTATCCT 
ATCTAAGGTT 
GCGGTCTATA 
TATGCAATTT 
ATACAGTCAC 
GCGATTCTAG 
AGAAATTTTA 
ATGGTATAAA 
TGTTTAGAAG 
TTCTGGAGGG 
CAAAACAAGA 
ACTGTGGACT 
GTATTGGCTG 
CTCAGAGCTT 
TCTATCATGA 
AGCTCCAGAG 
AACAAACTTT 



GTAGGTGTAT 
TAAGGTTATT 
TTAGTATTGG 
AGTGCTTATG 
GACGATGATG 
TTTCTAAAAA 
TGATTTAAAA 
GTATCAATAG 
TTTCACCTAG 
CAACAGGTAG 
TCAGAATTTA 
ATTTTTTCAG 
ACGAAGAGGT 
TTCCTAGGCC 
TACTGTCGTT 
AGTCGATGAA 
GTTTGTGCTG 
AGTTCTTCCT 
GCAGGACGAT 
CCTCTGCTCA 
TGTTTTGAAC 
CATCCACATT 
TTTGGTTTAG 
AATCCATGGT 
ATCCGCCTGA 
CGTTTTGCAT 
AGAATTTTTT 



TAATGGCACT 
GCAAAATCGA 
TACAGCATTT 
ATTTGGGGTT 
TACATCGTGG 
TTTTGAAAAA 
GAATGAAATT 
AGAATTGACA 
TTAAAGGTAG 
TATTGGCCGT 
AAATTATTTC 
CAACTAGAGG 
TTATAACGAG 
AGGAGGGTTT 
GCTGCTTCTT 
AAAAGGAAAA 
GCGAATTGGT 
ATTGATAGCG 
TGAGGGAATC 
ACAAGTCTTT 
CATCCTATAT 
GGTCAATAAG 
AAAATGTTGA 
ATGGTAGAGT 
TATGCTCTTC 
CTCCTAGGGA 
CCGGTAGATG

TGCTTTTCTG
TAAGGAGTCT 
TTAGCTCCTG 
ATCGACTTCG 
TTAAATTTAT 
ATAAGTGAGA 
TTAGAGTTTA 
ATTCCTCGAC 
CATGCTTGAA 
CAAACATTAG 
TATGGCTTCT 
AGTTTGCTCC 
GCCTGTCAGC 
AACCCAACTT 
CAGGAATCGA 
GCACTAGCTT 
TTCTAAGACT 
AGCATAATGC 
AAGAAACTGA 
AGAAGAGCTT 
GGAATATGGG 
GGACTCGAAA 
AATCCTGGCT 
TTTTAGATGG 
CCAATACAAT 
TGGTATGGAT 
AGGAGCGATT 



ATCATTCCAT 
TATGGCTTGG 
CATCTTCGAG 
GGAATCTCTG 
AAGCTATTTT 
AAAGTTCTCA 
GTCCACTTTA 
TTGCGGAGTA 
ACATTTAGCC 
AGATTGTGCG 
TATGGAAATA 
GTTAGCCGCA 
GATTCCCCCA 
TGTATCATGG 
GGCGCTACCC 
TAGCAAACAA 
GCAAAGGAAA 
TTTGTATCAA 
TTCTTACAGC 
TCTTGTGTAA 
TTCAAAAGTG 
TTATCGAGGC 
GTAATTCATC 
GAGTGTGATT 
ACGCTTTAAC 
TTTTCGAAGA 
TCCTAGTATC 



218651 
218701 
218751 
218801 
218851 
218901 
218951 
219001 
219051 
219101 
219151 
219201 
219251 
219301 
219351 
219401 
219451 
219501 
219551 
219601 
219651 
219701 
219751 
219801 
219851 
219901 
219951 



CGTTTAGCAC 
TAATGCAGCC 
CTTGGTGTGA 
GTTTATGCCT 
TAGAGCTCTT 
ATTTTATTCT 
CTTGGTCATC 
TAGCATAGGC 
AATATCGCAT 
ATGGAACGTA 
TGATATTCCT 
TTCTTGTTGC 
AGCATTCTTT 
TAAAGTGGTA 
CTGGAGACGA 
GACATGCTAA 
ACGTCCTGGC 
AGTTTGATCC 
TTGTATAGCA 
AGAGCTACGT 
TCTCAATGGC 
AAAGTAGCAC 
GGCTTCCGTT 
CGCAGTATGA 
CCTTATGTAA 
AGATCCAGAG 
ATCGCATTCT 



AACAGGTATT 
AATGAAGTAT 
CATTTTACGC 
GCCACTCTTT 
GCTCAAGAAA 
AGCAGCCCTA 
TGGTAGTAGC 
TTTGGTCCTG 
TGGATGCATT 
CCAAAGAAAA 
CAGGGATTTT 
TGGTCCTCTT 
ACATGAATGG 
GGTTGGGTCC 
GATTCTTACG 
CAACCTCTTT 
TATTTGACAG 
CACAAAATTC 
ACCAGGTGCC 
CCGAATGATC 
TCAGATATCT 
GGAATGACAA 
TTACATTACA 
GGCTGGACTT 
TCAATAGTTA 
TCTCCTTTGC 
AGCTATTGAT 



AGAGAAACAG 
TAGTGCGGAG 
AAATTAACGA 
AGAAGATATT 
TATAATCGAG 
GCTTTAGGGA 
AAAAGCTGTA 
CTTTATTTAA 
CCTTTTGGAG 
AGGGGAGAAG 
TTAGTAAGTC 
GCCAATATTT 
GGGAAGAAGT 
ATCCTGTTTT 
TGTAATGGTA 
ATTAGAGGGG 
TTCCTAGCAA 
GGGGTTCCCT 
CCTAACGAAG 
GTTTCGTTTG 
CAGATACTCA 
AATCTTCTTT 
CTCCCTACCT 
AAAGGCAAGT 
TGGATACATA 
CACAACCTCA 
GGAACTCCTG

GGGTCTTCTG
GTTCCTTTGC 
CTCTTATGGA 
TTAGAAGTAG 
TAGGTATATG 
TTTTAGTGTT 
GGAATGGCTG 
AAAGCGTATA 
GCTATGTTCG 
GGGAAGATAG 
TCCTTGGAAA 
TATTAGCTGT 
AAAAATTATA 
ACAGGCAGAA 
AGCCTTATGT 
CATCTCAATC 
AGAGTTCGCT 
GTTCTGGAGC 
AACTCTCCTA 
GATGGATGGC 
ATGAGTCTTA 
TCTCGTCAAC 
TCGTAATGAG 
GGTCTTCGTT 
GAAGGTGAAC 
AGAGAGGCTA 
TTTCTGGAAG 



GAAGCTTTTT 
GAAGAGATTT 
ATGTCATAAG 
ATGGTGAGGC 
ACAATAATCT 
AATTCATGAA 
TAGAGAGTTT 
GGCGGCATAG 
TATCAGAGGT 
ACTCTGTCTA 
CGCATTCTGG 
CTTGGCTTTC 
GCGACTGTTC 
GGATTGCTCC 
GGGAGATAAG 
TAGAAATCAA 
ATTGATGTTG 
GAGTTATCTT 
TGGAGAATTC 
ACACTTCTTT 
TGCTTTTGTG 
CTAGGGTATT 
CTTATAGATA 
ATATACATTG 
TTACTGCTAT 
CAGCTTGGGG 
TGTAGATATT 



220001 
220051 
220101 
220151 
220201 
220251 
220301 
220351 
220401 
220451 
220501 
220551 
220601 
220651 
220701 
220751 
220801 
220851 
220901 
220951 
221001 
221051 
221101 
221151 
221201 
221251 
221301 



TTACGTCTTG 
TCCGCAAGAA 
TCGCCTCTTA 
GAGTCTCACC 
TCAGCCTCGT 
AGTTGGAAGT 
TTGGAGCGTC 
GAAAGATCTT 
ATATTACTAA 
CTGAGTCCAC 
TACAGGATGG 
TTAGTATGAA 
GGAGGTTATA 
GAATATGAAG 
TCATCTTCTT 
TAAGGCTCCA 
AAGCAGTGTA 
GTTCTTTTGT 
CGTGCTCGAT 
AGATAGAGCG 
CACGTAAGCT 
TACCATCGTT 
TTCCCACTAA 
TAACATCTTT 
CAAATACGAA 
GACCCTGCGT 
TCGGGGCAAT 



TTCAGAACCA 
CTTGAAGAGG 
TCATTCCGAA 
CAGTAGAAGT 
CCTTGGATTG 
AGCTAAGAAG 
TTGATGCTGA 
AAGGTGAGGT 
GGAAAGTTTG 
AATGGCTTTC 
TCGGTAGGGT 
TTTGGCTGTC 
TCCTTCTATG 
ATTGTGGAAA 
TATTTTTCTA 
TTTTTTTGGA 
TAACGTAGAT 
TTGTCTTCCA 
TATCGGGGTT 
ACTTCCGATT 
TTATGTGGAG 
TTCATAGGGC 
GAATGAGAAC 
GTGGGGCAGT 
CTGTACGAAT 
TCGGAGGAGC 
CAGGGTGAGC 



TCGGGTCTCT 
TGAATTCTCG 
GATCTGTTAC 
CGCGGGTCCT 
ATGTTTATTC 
ATTAAGAACA 
GAAGCAAAAA 
ATAATCCTTC 
ATCACCTTGA 
AGGACCTGTG 
TTTCTGAAGT 
TTGAATTTGC 
TTTGTGGGAG 
GGATTTTGGT 
ACTTTTCAGG 
AATCCGTAAA 
GTCTTGGTTA 
TAACAGTTTG 
GTATTTCCTT 
AGTAATGATG 
TATCTAGCTT 
AGGCGGAAAG 
GTCTAAAGAC 
AGGTAAGAAG 
TTCACACAGC 
CATGAGAATA 
GAGTGCTACA

ATTATTGTTC
AGATGCTGAT 
AAATTTTGAA 
TATCGTCTTC 
TTCGGAGAGT 
AGGATAAACA 
CCATCTTTAG 
ACCTGTGGTT 
AAGCTTTAGT 
GGTATTGTGC 
GCTCTTTTGG 
TTCCTATTCC 
ATAGTCAAAA 
TCCGTTCACT 
ATTTATTTCG 
GGTTGTAAGG 
GGGTAAGGCT 
TTCTAAAGCT 
CTTTTAAAAA 
TAGGTATGAC 
TGTCTCTATA 
GAAGGAATTT 
GAAGGGAGTT 
CTGTCGTCCT 
GGTAGCGTCT 
GCTTTTCTTT 
CGAACAATGA 



AGCAGATGAG 
AAGCGGTTTA 
CCATTTAGGA 
TTGACCCTGT 
TTGGATAAAC 
AAGATACTAT 
GGATTTCTTT 
ATGTTATCAA 
TACTGGACAT 
AGGTTTTACA 
ATCGGTCTAA 
TGTTTTGGAT 
GAAGACGTTT 
TTTTTATTGA 
TTTTTTTGGT 
ACGAGAACCA 
TAAGTCTTGA 
GCTTCGGGAA 
CTCTTTCATA 
TCGTGTGGAT 
GTGCATACCT 
GCTATGTCTG 
TCCCGACATT 
AATTTTCCTC 
AGCTAGTGTA 
TTTGGCTTCT 
CTCCTCCAAT 



221351 


AGAATGAGTT 


ACGAAGTTTA 


TAGGGACTCC 


AGGCTTAAGT 


TPAGPTATTT 



221401 


TTTTCAGCAA 


GCGATTGAGA 


TGTTCAGCAT 


GCTTTTPTAP 



APT A AAPTTP 


221451 


CGCGTCTCAT 


AATTCCAAAT 


AAAGACATCG 


TAATGTTCTT 



TTTPTAPAAP 



221501 


GCGAGCAATA 


GGTTTTAAAG 


ATGTATAAGA 


TCTTAAAAAC 


PPATPPAPPP 


221551 


AGACCACGGA 


TTCTTTTTGT 


TTTGAGGTTT 


PPTTTAATPP 


PPP A ATTPP A 


221601 


GATGGAAGGG 


TTTGGATTAC 


CGAGGTTTCC 


GAPAATAAAP 


PATTAPPP AP 


221651 


AGCTAAAAAG 


AGTATAGTTA 


ATAAAAATTT 


PTTPATPTPT 



PAPPPATAT A 


221701 


ATTATTTTAA 


ATAAGTTTTT 


TATTAAAAAA 


GAATATPTTT 



A TTA A A APTT 


221751 


ATTTATAAAA 


ATAACTAATT 


CGCTTACTTT 



TAAPPPAAAA 


PA A A A ATP AT 


221801 


GTTATTGTTT 


TAGATAATTT 


GCTCAGAAAP 


TTPAPPPTTP 



TPPAT APTT A 


221851 


AAACAAGGCT 


TGTTTTTGGA 


AGTTCCCCAT 


PpATATPPPT 


PPA APT A ATP 


221901 


AGAGTCTGAC 


CCAGAGTTGG 


GGPAGPPTPA 


APPAPTTPAP 


PP AP APPTTP 


221951 


ATTATCTAAT 


CCAGCATGGA 


TATPATPTAP 


APAPAPTAPA 


ovjooA^AL-o 1 


222001 


GATGAGATTG 


CTTTAGATAT 



APPPAPTPAP 


PA APPPTTA a 


P ATTPPP A a 7^ 


222051 


AGACTGTGTT 


TPTGPPPTTP 


APTAPAPA AT 


rppAr a pap ap 

1 oAtiAL-At AtJ 


bLA ill GO 1 1 


222101 


CATAGTGAGT 


AGAAAGTCTT 


PPPPATPAPP 


PPP A A PPP A A 
V~7t- AAL. vjxo AA 


Lj 1 bt 1 1 tt 1 L 


222151 


ATTCGAGATC 


TCTAGGAAGT 


GATATAPAPA 


PPTPTTTATP 


A A AfpTPT^PP 


222201 


GCAACAGCAG 


TTTCAGAAAT 


ATPAPAATTT 


TTA ATT A ZiPP 
X X riJ-s. X X AAot3 


AAL. 1111 AAA 


222251 


TTTTAAGGCC 


AATTGTTCTT 


TTAGGTTPTT 


APAPPA A APT 


TPTTTPP ATA 


222301 


AATCTGAAAG 


TTTCTGACTA 


PAPAPAAAPP 


PTTPP ATPP A 
O X X uun X tjt^A 


HP A PPT 1 A Pprpp 
1 Atjvj 1 Abb 1 \s 


222351 


CCGTGTTTGA 


CCAACTGTTC 


ATPPPAPAPP 


PP APPPTTP A 


Vitj 111 Vat 1 1 (j 


222401 


CTTTTTAAGA 


GAGCATTTCT 


CTGPTPAAPA 


PPPPP ATP AT 


APTAPP AT A A 


222451 


GCAGAGGGTA 


TAGTGGTTAT 

J. ilU X UU X X ^1 X 


PPPATTPAPA 


TAA A APP AP A 
x AAAAot AU A 


rpmmr 71 7. 7. 7, 
1 1 lAbbAAAA 


222501 


GGCGACGATC 


CGCAGGAGCT 


CCTGAAATTA 


GAAGGCGGTC 


TTTTGAAGAG 


222551 


AAAAGCACAA 


TAGGTACTTT 


CCCTATCAGC 


TGCGATAAGG 


TTTTTATAGG 


222601 


AAGTTGGTTA 


TAGCAGATTT 


TTTTTCCTTG 


CTTGTCTGTA 


TAGATGGAGA 


222651 


GAGCTTGGGG 


AAGGTGGTCT 


TTCTCAAACT 


GTGTTTCTAA 


GAAGAAATGG 



222701 
222751 
222801 
222851 
222901 
222951 
223001 
223051 
223101 
223151 
223201 
223251 
223301 
223351 
223401 
223451 
223501 
223551 
223601 
223651 
223701 
223751 
223801 
223851 
223901 
223951 
224001 



GAAGATCCGA 
TCCCAAGGAC 
CATAATTGAG 
AAATTTTTTA 
GACCTAAAGG 
AGGCATGATG 
TATAGGAATC 
AGGATATCTA 
AGAATAATTT 
TGGCTGTTAG 
GAGGACTCAT 
TTCGCGATGC 
AGAAATCTGG 
TCACATTCAA 
TTCATCGGAG 
TATATTCCCC 
GCTAAACGCT 
AGCAAGCAGG 
CGAATGAAGT 
AAACGCAAAG 
CATGCTGAGT 
CCCCTGCTGA 
AATTGAAAAA 
TTTAGCTTTG 
AAACTAATTC 
ATAGGTGTGT 
TAGCTCATTT 



AGGTGATGGT 
AAAACATAAA 
TTTAGGAGCC 
GCTTCAGAGA 
ATTCTCCTTA 
ACAAATAATC 
CGAGATCCCT 
AAAAGAAAAA 
ACAGCCATGC 
AGTGAGCTCT 
TTGTAAATAA 
AGATCGAGTT 
AAATTCTCCA 
CCGCAATCTT 
CACATCTTTA 
AGAAAAACTT 
TTCCGTCAGT 
ACTCCAGTAA 
TCTCTGTAGC 
CATTTTGTAT 
AGGCGAAAGC 
AGAGGAAATT 
ATCTCTTGGA 
GTGACGCAAC 
ATCATTATAA 
TTTGAGGGAC 
CGGGATACAA 



ATCTGTGAGA 
GCGCTTCTAG 
AGTGAGATTT 
GCAGATTTTC 
GGGAGTTTAT 
CTGAGGCAGA 
AAGCTGACTA 
GGGATTAAAG 
TTACCTTTCC 
CCGGGTAAGA 
AGCCACTTGT 
TTACGTTGCT 
GAAAGAAGTT 
ATCTTGATCC 
TAATTTCTTC 
TTATCTAAAG 
CCCTACGATG 
GAACATAGCG 
ATGGTTTTTA 
ATCAGGGAGC 
ATGAAGATCC 
TCTAAATTTG 
GGGAATGGAA 
GTGTGCTCAC 
GTTTCAATCA 
GACACTTTGA 
CGAATTTCAT

TGTTGCGTGC
GAGGTTTGTT 
CTAAATCACT 
ATAAACATCG 
TAATCATCAT 
ATCGGTAATG 
ATTCATCCTT 
GCAATTTCTA 
TTCACCCACC 
AAGAAAACTT 
TTGAGCAGAG 
TTCTGTAGAT 
TTGTGATCAG 
AAGAAGATCG 
TACTGCTTTG 
TAACTTCAGC 
GTAGCCACGC 
GCTTTCTTCT 
GCTGCTCTGC 
ATGGGGAAGT 
CGAGGTGATT 
CCTCTGTTAA 
ATAGCGCCTT 
TGTCAGATCC 
AAACATGGGT 
ATTTTTTTGA 
ATTTTCCTAT 



GAAAAGACCT 
TTTCCTTGGG 
GTGGTTACGA 
GCAGGGTAGT 
GTAGCCTCAT 
ATTCCAGGAT 
ACTATGCTTC 
GGAGTTCGCC 
TTAGTACAGT 
CACGGAGTGA 
TAATTAGTTC 
ATGACGGGGG 
GAGAGTATTG 
CAGCTTCACC 
ATAGGAATAA 
ATCTATTTTT 
CATTGGCGAT 
CTAGATACAG
AGGCAAGGAA 
CTTCTTTTTC 
TGTGCCATTT 
TTCTTTTACT 
TCTCATAGAC 
GTAGCAGTGA 
GAGTACTGGA 
TAAGGTTTCC 
AACCTGAGTC 



224051 TAACTTTATA TGAAGGGCTA CGCCTCTCTA GGAGAGACGG TAGAGGTACG

224101 ATCGGGATAT TACTACAAAG TGATTTTCTG GTCTTCTAGA TATTTTACTA 

224151 ATAAAAAAGG CTTTAGATTG AGGAAATCTT CCCTGGAAAT CAAGGAAAGA 

224201 GGATTCTAAT CATTGTCTTA AGAAGGAAAA ATTGCTTTTT GTTATACTGG
224251 TTTTTGTCCC CAATTATGGG GGAGGATCTT ATGGCACAAA AAGAAATTGT 

224301 TTCTAATCGC AAGGCTCTGC GTAACTATGA AGTTATAGAG ACTTTAGAAG
224351 CAGGCATCGT TTTGACTGGG ACTGAGATTA AGTCGTTGCG CGATCATGGG
224401 GGAAACCTCG GTGATGCTTA TGTCATTGTT TCTAAAGGTG AGGGGTGGTT 
224451 ATTAAACGCG AGTATTGCTC CCTATCGGTT TGGAAATATC TATAACCATG 
224501 AGGAGCGTCG TAAACGTAAA CTCCTTCTTC ATAGATATGA ACTTCGTAAG 
224551 TTAGAGGGTA AGATTGCTCA AAAGGGCATG ACTTTGATTC CTCTGGGAAT 
224601 GTTTCTGAGT CGCGGCTATG TTAAGGTACG TTTGGGTTGT TGTCGTGGGA
224651 AAAAAGCTTA TGATAAGCGT CGTACGATCA TAGAAAGAGA AAAGGAACGT
224701 GAAGTTGCCG CTGCTATGAA GAGGCGCCAT CATTGATATA GGTTAGGATA
224751 TGGTGTTCTT CAGCCCACTG TTTTGCTTCT ATTTTAGAAT CAAAAGTCAT
224801 GAGGACTGTG GCAATAGCGT CGGCGTATGC GCAGCTCGGA TGGACTACTG 
224851 AAACACTTTG GATAGGATAG GAGCTTAGCT CTAGGGGTTT CCCTGTACGA 
224901 GTATCAAGAA TATGGGTGTA AATTTTTCCT TCAACACACC ATTTTTGAAT 
224951 ATGATTTCCA CTTGTTGCAA TTGCCATATC ATCGATATCT AAGATCGTAC 
225001 CTGCTGCTTC AGAAAAAATA CGCCAAGGTC TTCCCGAGGG ATGATGCCCT 
225051 GACGTTTTGA TCTCTCCTCC CCACTCTACA TAGTTGTTCG GACAAAAGGT 
225101 ATTGCAAATT TCATTTAGAC AATCTACGGC ATAACCTTTG ACAACACCAC
225151 AGAGGTCGAT TTGAACATGA GGATTCTTTT TGATTAGAGT TTTTGTGTTT 
225201 GACTGAAACT CCAAGTGTTG CCAGCCCATG TCTTTATAAT GTTGTTCCCA 
225251 AACGTCTTTA GGGGGGAGGG TTTGACTTTT GAGATGTAGA AGCCATAGGG 
225301 TTTTTAAAGG TCCTACAGTA GGGTCA^AAC GTCCTTCTGA AAGTTTGTAA
225351 AGTGTATCTA CCTGATCTAG AAACTCGGAA AGTTCTACAG ATAAAGTTAT

225401
225451 
225501 
225551 
225601 
225651 
225701 
225751 
225801 
225851 
225901 
225951 
226001 
226051 
226101 
226151 
226201 
226251 
226301 
226351 
226401 
226451 
226501 
226551 
226601 
226651 
226701 



GGGGACATCT 
TCCAGTTGTT 
TGGGATAAAG 
GCGATAGAAG 
AGCATGAACA 
ATCGCCATGT 
CCAGGAACTG 
TACATCCCCA 
TTGTTCCTAC 
ATAAAAAGCG 
GATTTCTGGG 
TAGTTTGAGG 
ATGTTGCTTC 
ATAATAGTTC 
CATCAAAATT 
TCCACATCTT 
GTGTTTGGGC 
GATTCAATCG 
GGTAACTTGT 
TTTCATGCCA 
CAGCAAGCCC 
TGAAGGATTT 
CTAAAAAAAG 
TATATAGATT 
GAGGTATCAT 
GCGCACATAG 
CACAATGACC 



GCTGGAGCTC 
ATAAATCGAG 
ATGCTTTTTC 
ATTGTCATCT 
GAGTCCAAGA 
ATTGCTCATG 
GAGTGATTGC 
AGAAGAGTAT 
ATCTACGATC 
GTGCTCCTAG 
AGGTTTTCCG 
ATGCTTTTGC 
TCCCTACAAT 
AGGAGTTCAA 
TCCAAGGAGC 
TGTCTGGGGA 
AAGGGAAGTT 
TTCTATGAGC 
GCGCTTTGGA 
ACGTACACCT 
CGGAGAGGTA 
TTTCAGCTGC 
ATATCATTTT 
ACAGAAGTGA 
TAGCACATCT 
GAATGGTTAG 
AAGGCGGCTT 



GGTTGATTAT 
TCGATCTTAT 
TTTTGCGGAT 
GCTCTCCTTC 
CATAAAAGAA 
AGCATAGCGA 
TGCGCATTTT 
AGCCTTTCGC 
ACAGCATGTG 
AGCAGCAATA 
ACTGGCTATG 
ATCATGAGGG 
AGCGGCATGG 
TAATTCCTGC 
AACTTTCCCA 
GATCGCTTGG 
GCACGAGGAT 
TTAAGGACTG 
GATAATTCCG 
CAGATGCGGG 
GGACTTTGTG 
AGGAATCCCT 
ATGGATTCCT 
AGATGAGATT 
ATTGACCAAC 
GAGTAAAAAG 
TGCTCTCTTT 



CGAGAGTTCA 
GAAAGCATCT 
AAAGAGGTTC 
GATTGTTGTC 
CTAAGAAAAA 
CAGTCATGGG 
GTCACAACGT 
ATTGTCTGCA 
GGGCTACCAT 
ATGATATCAG 
AAGAACTGTG 
CCGCTAAGGG 
CGGCCTCGAA 
AGGAGTGCAG 
TGTTCACAGG 
AGAATCACTT 
GCCGTGGATG 
AGGAGAGGGT 
ATTTCTGTAG 
GTCATTGCCA 
AGATTTCCTC 
CTCAGTAACA 
GAGTTTATTG 
ATCTAAGAAA 
GGACTTCATG 
ACTCCATAGA 
TGAAAAAATG 



GAATAGGGAT 
ATCAATTTGT 
CCAGAACAAT 
GTTTTTTGAG 
TTTTGGTAAC 
ACCAACGCCT 
TATTAAAATC 
GGGACTCTTG 
AGTTTCCTTT 
CTGTCTTTAA 
ACTGTACAGT 
TTTCCCCACG 
GAGGAATTTC 
GGTAGAAGTC 
GTGAAGCCCG 
CGCTGTCCAA 
CTAGGATCTT 
AGAGTCAGAG 
CTTTTTTGAC 
ATCAGGACCA 
TTTGAGTCTC 
TACCAATCTC 
CGCAAGGAAG 
GAACATTTTA 
TCCGAACAAG 
AAATCATATT 
AGGTGTGCCT

226751 TATGGGTGAT CAGAGCAGTT AGAGGGTAGA GCAGTAGAAA TACACCGCGG

226801 CTTCCTAAGG CAAGATCACT TAAGACACCA CAGCCTAGAG CGAGTACCAG

226851 TCCCTTATCT TTTGAGAAAC AATAGAAACA AAGAACAATA TAAGGGGAAA 

226901 AAAGGATCAG ATGTGCTTTA GGGAAGAGCG TTGGAAGGGT GAACATGGAT 

226951 AAAATACACA AAATAAAGCA TTTGCGCGGT GTTATGTGGA CAGACATGGT

227001 TTACGCTTTG AATTATGATT ATCGATTTCA AAAGGTTTTG CAGACTTGTA

227051 TATAGCAAAG CATAGGATTC TTAGCTTGAT TTTTTATGTT TATCTAAACT

227101 GCTTTTAAAC AATTATTTAT ATAATATTTT ATTTTTAGAT TAAAATTAGT 

227151 TTATTCTTAA TAAGTTTTTT AATTAAAATT AGTTTATTAA GTTGATTTTT 

227201 TTGTTTTTGT ATTTTTTATG GGGAAGCCTA AGAAGAGCAG AACGGATAGG 

227251 GCTTTGGCTC AGGAGATTCA AAAGAAATCA ACGGAAGTGT TGAAGAAGCC 

227301 TGCGCGGATA AAAGCTAAAA ATCGTCGTAA ATTTCTTATT GCTAAGGAAC

227351 AGAAAACTCT TAAACACCGT GCTCAAGAAT ACGATCAGTT AGTTCGCTCT

227401 CTCTTAGATT CTCAGAAGAA GGACACCGAT AAAGTTTTGA TTTTCAATTA
227451 TGAGAATGGG TTTGTTTTTA CTGACAAGGA CCATTTTAGT AAGTACTCTA 
227501 TCCGTCTTTA GGCGGGGTTA AATTAAAAAA AAATCGCAGA ACGGGACTAT 
227551 CCCGTGATGG AACGACAGGA AGCCTGGGTA ATTAATTTAG AGTTAGGAAC 
227601 TGTCTCGGTA TTAAGATCTC CAAAATCTTA ATACCTAGAC AGTCGGATCC
227651 GAGTGATCCC TCTAGCTCGA CAGCGATGCA CTTTGTTGCA TCTACTCGCT
227701 GTTCTCTGTC CTCCCATAAG TTTTTTTACG CAAGGTGTTT CACCTTGCGT 
227751 TTTTTTTTGT TTTTTAGATT TTTAAAATTT GTTTGTAAGT TCTTTTTTTT 
227801 AAATTAATTA TTTTTTGTTT TAAAATTATA AGTAGTTATA AGTTTTATTT 
227851 TTTAAATTCA GAGAATCATT ATGGCAGTTT CAGGTGGCGG AGGGGTTCAG 
227901 CCTTCTTCGG ATCCAGGAAA GTGGAATCCT GCTCTGCAAG GAGAGCAGGC
227951 AGAAGGCCCG TCTCCGCTAA AAGAATCTAT ATTTTCTGAA ACCAAGCAGG
228001 CCTCCTCTGC TGCGAAGCAG GAAAGCTTAG TGCGTTCAGG ATCTACAGGA
228051 ATGTATGCAA CAGAATCTCA GATAAATAAG GCTAAGTATC GTAAAGCTCA 
228101 AGATCGATCA TCAACCTCTC CAAAATCCAA ATTGAAAGGT ACATTTTCTA
228151 AAATGCGCGC TAGTGTGCAA GGATTCATGT CAGGATTCGG ATCTCGGGCT
228201 TCGAGAGTGT CAGCAAAGCG TGCTTCCGAT AGTGGTGAGG GAACATCCTT
228251 ATTGCCGACA GAGATGGATG TTGCTCTAAA GAAGGGAAAC CGTATTTCAC
228301 CTGAAATGCA GGGATTTTTC TTAGATGCTT CGGGTATGGG AGGGAGTTCC
228351 TCTGATATTT CTCAGCTTTC TTTAGAGGCT TTGAAATCTT CAGCATTTTC
228401 AGGTGCCAGG AGTTTAAGTT TAAGCTCTTC AGAATCTAGT TCCGTGGCTT
228451 CGTTTGGATC TTTCCAAAAG GCCATAGAGC CTATGAGTGA GGAGAAGGTA
228501 AATGCTTGGA CAGTGGCTCG TTTAGGAGGG GAGATGGTCA GCTCTCTTCT
228551 CGATCCCAAT GTTGAGACCT CATCATTAGT GCGCAGGGCA ATGGCAACAG
228601 GCAACGAAGG CATGATAGAT CTTTCTGATT TAGGACAGGA AGAGGTCAGT
228651 ACAGCCATGA CATCTCCCAG AGCAGTAGAA GGAAAAGTAA AGGTATCTTC
228701 TTCTGATTCT CCAGAAGCGA ATCCAACAGG AATTCCAAAT TCTAATACTT
228751 TAGAAAGGGC GGAAAAGGAA GCAGAGAAAC AAGAAAGTCG AGAGCAGTTG
228801 AGTGAGGATC AGATGATGCT TGCACGTGCT ATGGCTGGGC TTCTTACAGG
228851 GGCAGCGCCT CAAGAGGTAT TGAGTAATTC TGTTTGGTCT GGTCCTTCTA
228901 CAGTATTTCC TCCTCCCAAG TTTTCAGGAA CTTTACCCAC CCAGAGATCG
228951 GGAGATAAAT CAAAGCATAA ATCTCCAGGA ATAGAGAAGA GTACGAACCA
229001 TACGAACTTT TCTCCTCTTC GGGAAGGTAC TGTGAAGAGT GCTGAGGTTA
229051 AAAGTTTGCC TCATCCAGAA AGTATGTATC GTTTTCCTAA AGATAGCATC
229101 GTTTCCAGGG AGGAACCTGA AGCCGTTGTT AAAGAATCTA CGGCATTCAA
229151 AAATCCAGAG AATAGCAGTC AAAACTTTCT CCCTATTGCT GTGGAGAGTG
229201 TTTTCCCTAA GGAAAGTGGT ACGGGAGGGG CTTTAGGAAG TGATGCTGTG
229251 AGTTCCTCAT ATCATTTCCT TGCGCAACGT GGAGTGTCTT TACTCGCTCC
229301 TCTACCTCGT GCTACTGATG ACTATAAAGA GAAGCTCGAA GCTCATAAAG
229351 GTCCTGGAGG TCCTCCAGAT CCTTTGATTT ATCAGTATCG AAATGTTGCT
229401 GTTGAGCCGC CAATTGTTCT CCGTTCTCCC CAGCCGTTTT CAGGATCTTC



229451 ACGTCTATCG GTTCAAGGAA AGCCTGAAGC TGCTTCAGTT CATGACGATG 

229501 GTGGGGGGGG AAATAGTGGT GGTTTTAGCG GAGATCAAAG AAGAGGATCT 

229551 TCGGGCCAGA AAGCTTCCCG TCAGGAAAAG AAGGGAAAAA AATTATCTAC 

229601 GGATATTTAG GGTTTTAAGT CGGTTTGATG TGATGCGTAT AATTCGTTTT 

229651 GATCCTTATG GTGCGCTATC TGCACAAAGC ATAGCTAAAG ATTCCCGTCA 

229701 AAACTCTCCT TTAGTAGAAA AAATTTCTGA GGAAATTGCT ACGAATGAAG 

229751 CGATTCGGCT TGCCTTGCTA GCTATTGGAG ATCGCGAACA AGAGGAGAAG 

229801 AAACAGAGGC ATCGTTATAA GCTACTCGGA CAAAAGCAAG CCAAGGTCTT 

229851 GCTTTCTCAG TTGCGTCATG TGCATTTAGA TTTTAAAAAA CTATATTGCG 

229901 ATAGTAAGAA AAAAGAAGAT CAGGAAAAAG ACGAAAAAAA CAAACAGAAG 

229951 CGATCTATTA AAGTTACAAA GAAAAAAAAG GGCATCTCTT TAGGGGCTGC 

230001 CGCTTCTCAG GCAATTGCAG CAGCAGCAGA AGCTTGGGTA ATTGCTAGAA

230051 ATAAAGGAGT CTTAGAAACT GCCTCCACTC TTTTTTATCA AAAGGATGAA

230101 GAGGCCTAGA CATACTTGAA TCACGAGCTA GGCCTTCTTC TGTGACTATT

230151 TTAGATCATC ACGCTGCTTC TTGCTCTTCT ACTGGAAGAC TTTTTTCTAT

230201 AATTTCTTTT TCTTCATCGT CTACTGTAGG ATGTTCTGAA TGCTTAGCTA

230251 GATCTTTCCA AATCATTCGA AGTTTCTGAT TTTGTTGTTT GGTCAGGTTT

230301 TCCAGTATGA TCAAGCTTTC ATCATTTAAG GAGAATCTTC CTTTAGACCA

230351 ATTTATAGAT CCTGCAAGTA GAGTTTTATT ATCTATAACT GCAAACTTAT

230401 GGTGAAGAGT ACAGGGTGCG GTATTTATAG AAACAAAGTC TTTATTGATA

230451 TTTAATTGTC GTAATTGCTT AAAAGTAAGT TTGCTATGAC TTCTATCAAT

230501 GATAATATCT ACATGGATTC CTCGTTGTTT TGCTTGATGT AAGGCTTGAA

230551 TAATCTCCGA GTGGGTCAGA GCAAACATAG CAACTTGGAT GGTTTTCTGA

230601 GCTGTCTGGA TTTTTTCGAG TACAGCTTGT ATTGCAATTT TACGATCTTG

230651 AGGAAGAACA AAATACTTTC CTGTTTGATC CTTTATAGAA AAGTCTCCAG

230701 AGGTATTTGT GATAATGAGA TCACAGAGCT CCGAGCTATG CATTCCTAGA

230751 ATGAGATTAT TATCTAAACG TAGAGAAAGA TTGGTGTAGT TCGCAGATCC
230801 TAGCCAAGCA TCTTTCTTAT CTATGGAAAG AGCTTTTTGA TGCATCAGTT
230851 TACGCCCTGC TGGAGGTTGC TCGACTAAAG TTACATTGCT GGCTTGCTTT
230901 AAGATTTGGG GAATTTTAAA TTTTTGATAG TAGATCGTAA CTTTGTTTTT
230951 TGCTTGAGCT TGTCGAGTTA AACTCTGTTG GATCTTGGGT TCTGAGAGGT
231001 TATAAATACG TAGGAAGATC TCTTCATCAG CGTGTTCTAT AGCATCGCAT
231051 AGAATTTTAC GCATGTCCTC ATTGCATTGA TTTGAGTAGA TGATAGCTTC
231101 TTCAGACTTT AAAAAAGTCT TAAAAGTGTC ACCACGAGGA GCTCTTGCAA
231151 AAATTCCTAC TAAAATCAAC GTGCTAATAA TAACACAGAT TTTTAATTTA
231201 TCTTTTTGTC TTTTATTCAT TTTTTACTCT TAAAATAGCT TTTTTGTTTT
231251 TATATATTTT TAAAAATAAA AATAACTAAT TTAATTAAAC GTTTAATAGA
231301 ATAAGATTCT TATTAAATAT AAAAAGTTTC TTTATTGGGG AAATATGCGT
231351 GCCATCGTTC GGAAACCTTT TGTTTTGTAG ATGGGTCGAC CTCTACTTCT
231401 TTAGGATAGG AAGGCTTCAT CAGGGCATCG ATAACGAAGG GAAAGTTGTA
231451 ATTAGGACGG TGAGTAGCAA AATGGCTGTG GAGCGCGTGA AGATCATTTG
231501 CTGGGGCACA TCGTGTGAAG GTCCTCCAGA GAAAATCTTT TTCACTTTGA
231551 ATGGTTTCTC TCAGATTATC GGCAAGGATA ATCAGAGGCC ATGATTTTAG
231601 ATCTGGATGG TGAAGGAGAG ATTTAATACA TCGGTCCTCG AGGGATGTTT
231651 CCAACACTAG GCAACCACGA CAAAAGGGAG CGATGTCTTG AACTCCATGG
231701 ATTTTTCCTC CCTGATATCC ATGGGGAAGG TCTCGGATGG CTTTTCCTAT
231751 TCCCATGAAG ATTCCCTTGG AGCCCTTATT TAAGCTTGGT CCTGTATAGT
231801 CTAACGTATC GTTTGCAGTT TCTGAGAAAA TAATAAGATC TCGGTCTGGC
231851 TGTAGACGCT CTAAAATGGT TTCTAGAACC ACGGAGAACC TGTCGAGAGG
231901 CACCTCTTGG TCTGTGACCA TTAGGAATTT CGTTAGGGAA AGTTGGCCCT
231951 CTCCAAGAAT TCTAAGAGCT GTGGTTAGAG ATTCTCTCCA ATAGCGTTCT
232001 TTAACGACAG CGGCAGTCAG TGCATGAAAC CCTGATTCTC CGTAACTTTT
232051 AAGTCTACGC ACACCAGGCA TAACTAACGG AAATAAAGGG GAGAGGTATT
232101 CTTGGAGTTT GTTCCCTATA TAAAAATCTT CTTGGTAGGG TTTGCCGACT



232151 ACTGTAGCAG GATAGATTGC ATCTTTTCTG TGATAGATTT TATGACAGTG 

232201 GAATTCAGGG AAGTCATGTT GGAGACTGTA GTATCCGAAA TGATCGCCAA 

232251 AAGGACCTTC AGGACGACGT TTCCCGGCCG GAGATTCTCC GACCAGGATG 

232301 AATTCCGCAT CGTAGAGTAG AGGGTGGGGA TGGTCGTTTG TTTTTTTATA

232351 AAGGAGCTTC GCTCCTTGGA GGAAGGTAGC AAAGAGAAGT TCCGAGACAT

232401 TCTCAGGTAG GGGGGCAATC GCAGAAAGGG TTAAAAAGGG GTTTCCAGAC
232451 AAAAATACCG AAACAGGAAG GTTTTGCTTT TTTTGCTCTG CTTCATACAG 
232501 ATGCATCCCT CCGCCCTTCT GGATTTGAAA ATGGAGGCCC ATGGTGTTTT
232551 GATTGAACCG TTGCACGCGA TACATCCCAA GATTGGGTGT AGTAAGAGTC 
232601 GGCGATTCCG TATAGACAAG AGGAAGTGTG AGAAAGGCTC CACCATCTTC 
232651 AGGCCAGCTT GTGAGTAAGG GAAGGTGATC TAAGTTAACT GAGGACATAG
232701 AAACAAAAGG AAAGCGACGG AATCGAGCTT TTTTGAGCCC TAAAGAGCTT 
232751 ATTCTTTTTA ATAGATCCCG AGATTTCCAT AGAGAAGAAA GCTTTGGTGT 
232801 AGAAGAAATA AGGTGGGCAA CTCGAGCGAT GAGGTTATCA GGAGCTTGAG 

232851 AAAAAAGTTG GTCTACACGA TGTTTTGTTC CAAAGAGATT GGTCAGGACT
232901 GGGAATGACG ATCCGATGAC ATTATGAAAA AGAAGGGCAG GGCCTTGATC
232951 TTCAATAACA CGACGATGAA TCTCAGCTAA CTCGAGGTTA GGACTTACGG 

233001 GAGCAAAAAC ATCAATAAGT TGTTTTTGTG AACGAAAAAG AGAAATATGA
233051 CGCCTTAAGA AAGACATAAA TATTTCCCTA TACTTAAGTT AAATTAAAAA
233101 TTTTTACTTT TAGCTCTTTC GAGAACTTTC TCTAATCCGA GCTTATCAAT 
233151 GTGACGAAGA GCGCTAGCAG AAATTTTAAG CTTAAGAAAA CGGTTTTCTT 
233201 CTGTAGACCA TAGACGCTTG GTCAACATAT TAGGGAAAAA TCTTCTTTTA 
233251 GTCTTCCCTG TCACTTTCAA ACCAATTCCT TTTTTCTTTT TAGCAATACC
233301 TCGAAGTGTA TAGCTATAAC CACGGCGAGG TCTCTTTCCT GTAAGTGGGC
233351 ACTTTCTTGA CATATTCTTC CTATGAATTC TCTAACATCC GCTCAATAAA
233401 GCAACTCTAT GAAAAGGAAT CTATACTAGA GCTTTTCGAA TATATGGGGA
233451 AGAGGTTTTC TTAGAAAATG TGATTGTCTG TTGTTGATAA TGAAGGAATT 
233501 CTTTTAAGAA AGACGGTCGT TCCAGGAAAT TATTATAAGG TTCTTACTAT
233551 AAAAAGACTT AAATGTTTTA TTGCTATCCT TACAGTCCTG TAAGGATCTT
233601 CTCAATGTAA CCATTAAATT TTTTATGAAT AGCGAGTTCT TCTAAGGAAG
233651 GCCGAACTCG ATACGACCAA TTCTTTTTAG AAATTGTCCC AGGTGTATTA
233701 ATGCGTTCTC TTTGTAGATT TTTTGATACT AAATCAGGGC AGAGGGCGAG
233751 ATAATCGTTA AAGAGGTTGA TATGAAAGAT AGATGCTGAT TCATGAGAAA
233801 GTTTTAAGAT GTCTATTTGA GTTTCTGTAG TCAGGGTTTT TTGAAAAGGA
233851 AGATGTAGAA ATTTAGCAAA TTGCTTAGCT TCCTTAGGTG AATTGAGCCA
233901 CCATTGGGCA AACGTATCAG AGTCGTGGGT AGAGAGAGTG GTCACAGAAA
233951 GTGGATTATA ATCTTTTAGG GGAATGAAGG CACTGTCGCT TTCCCAGTTG
234001 CGTTCCCATC GTGGAATCCG GGTTCCACAG ATTCCTAAGT GTGTTAATGT
234051 CGTTTTGACG TCTTGGGGTA TAATCCCTAA ATCTTCTCCG ATAGGTAACA
234101 TAGAAGAGGC TCCGAGCATA GTAGAAAGGA TCTCCGTGCC CTGCTTTATA
234151 TAGTCTTTAG GATTGTCTGG AATGAACCTT CCTCTTCCTG AAGAATCCCA
234201 AATCCACAAA CGGAAAAATC CTATAATATG ATCTAAGCGA TAGACGGAAT
234251 AGAAGTTTTG AGCATATCGC AGACGCTCTT TCCACCAAAT GTAGTCGTCT
234301 TTGGCAAGTT GTGAAAAATT ATAAATAGGC AGATGCCAGT TTTGTCCTTC
234351 AGAATTGTAG AGGTCAGGAG GAGCTCCTAC AGACCTTGAT GAAGAAAAGT
234401 AGTCTCGGAA ATACCAAACA TCACAGCTAT CCTTGCTAAT AAGAATAGGG
234451 AGGTCTCCTT TAAGCAGGAC GTGGTGTTGA TCTGCATAGG CTTTCACTTC
234501 GCAGAGCTGT TGGTAACAGA GAAACTGTAG ATAGGAAAAA AAGAGGACTT
234551 CATCATGGAA TTTTTTAGTT AAGTCCGGAA AATTCTCCTG ATCTGTGAGC
234601 GACTTCGGCC AGTTATTAAT AGGTTCTCCG TGCATATGAT GTTTGATTGC
234651 ACGAAAGGTC CCATAGGGAT AAAGCCAATA GCGCTCGCTT TCTAGAAACT
234701 CAGAAAAATT TGAGTTTCCT TCGAGGGAAG ACTTGCAACA TTTTTGGTAG
234751 TACTCTCTTA AGAATGCCCA TTTTTTTTCT TTAACTTGAG TATAGCTGAC
234801 TGATGGAGTC GAGCATAACT CATGCATATC TTGAAGTTTC TTGGCAACTT



234851 CAGGGATGGT ATCGATATTT GGAAGAGAGG ATAGGGAAAG GAATAGGGGA
234901 TTCAGGGCTA CGGAAGAGAT GCTGTTATAG GGACTCGTAT CTTCACCAGT
234951 ATCATTTAAA GGGAGAAGCT GAATAACGCT GAAGCCCTGT TTTTGGCACC
235001 AAGAGATCAG AGGAATGAGA TCTAAAAATT CACCGATTCC ACAGCTATTT
235051 TTTGTGTGTA TTGAAAATAG TGGGAGATAA ATCCCGTGTT TAGGAGAGGT
235101 TCCTATAAGT TTCCAAGCAT GTGCTGAGGG TGAGTGTTTT GTGTATTTTA
235151 AAACATTCAC GCGACGTAGT AGATTCCCAA AACATGAAGG TCAGGAAGAT
235201 TCCCTGTTCT TAGGGATTCC ATCCAAATCT TAGCATGCAA AGAAAAGAGC
235251 TTGAGATACT CGAATAACTT TTCTCCATTT AGGTAGTTCA TACTCAGGAT
235301 ATCTGAAAGA TAGAGCTGTT GAGTGACCTC ACCGTAGCCT AGAATTCCCT
235351 TGATGCTGGA TTGGAACGAG CCATTTACAG AGAGAGCAGC TTTGAAAATA
235401 CGCTCGCGAA ATACGTTTTC AGGAAGAGTA CCTAGTAGTG TCGATACTGC
235451 AAGATCTCCG GAATTTCCAT CTTCTTCTAT TTGCACAGGG ACATGGGTAT
235501 CGCTGAAACG GATTAAACAA GAGTTATTTT TATCTGGAGC AAGTGTCGTA
235551 TTTAATAGGG GTGCTAAGGA TTCTAGTAAT TGCTCGTATT GGTTTTGCAT
235601 GGCAATCCTT TTTTAATCAT GACCAAGGAT AGGGTTTAGG GAAGTCTGAT
235651 GCTTTGGGAT AATCTTCATT GTTTATATTT ACAGCATCTA AAGCATTAGC
235701 AATCATAGCT CCTAATTGCT GACGTTTGTC TGCTGAAGAG AAAAGGCGTG
235751 ACGACGTTTG ACGTAAAGCA GAAAAGAATA AGTTCAAGAC ACCGGTCACA
235801 GAATCAACAT CGTCTCCTAT GAGATTGCGG ACTTCTCGTT CTACTTTAGA
235851 TGCTGTTGGG AACTTATCGT TAATGATTTT ATGGTAGGAC TCAGCTACCT
235901 TCACAAAGTT TAGATCAGAA GGAGTTTGGA TTCCCTCAGC TTTTAAGCTA
235951 TCGAGTAAAA TAGGAACGCG ACTTTCAAAG TAATCGTACG AGGTAAGAAC
236001 TGCTTGCAGG TTACGAGTTT CTGTCATGAG AACTTGTAGT TGCGCACTGG
236051 GTACGTAGGG ACCCTGCCTT TTTAATTCTG TTGCCATTCC TTTCATTAGA
236101 AAGGAGCTGA CAATAGCCAT ATCTTGGTAG GTATAGCGGT CTTGAAGCAT
236151 AGAAAGTAGC TGATCACAGG TATGTGTGTC TCCAGTCACT TCTAAGTACA



236201 AAGAGCGAAG CCCTGAAGGA GAAACATTCA GTTGGTCTGC ATATTCTTGA
236251 GAGGCAAATA AGATGTTTTT CGCACCAATA GCAGTTCGTC CGAATTGCTC
236301 CGTATGAGTA TTCCTTGCTT GGATAAGCGC TTCTTTTAAT TTACCTTGGG
236351 AGGGTGGAGT CGTTTGAACC AGGTAGTCCA AAGCTGTGGA TTGCAGAGCT
236401 GGGTCTTTAA TTTTCTCTTG TACAAGAGCA AGAATGTCTT CTGGAGAAGC
236451 ATCGTCTCCT ATTGCATCAC GCAGGCCGCG AAGTTCTTGA CCAGAGATTT
236501 CAGAATTCCC AGAAGCATAC TTATCAGCAA GATCTGTGTC AGGCTTCTCT
236551 TCTGTAGATT CAGATTTTTT CTCAGCCTTT CCAGCTTCTC CTTTTTTCCG
236601 AGATTCTAGA GTTTGAAACT TCTCTTCCTT TTTTTTCGTG CGTGTTGCTG
236651 CTGCGGGATT TGTCAGGTCC TGAGATTGTT GAATCATGTT CATCTCAGAA
236701 CCTTCTTGGC TGGCTACAAC TTCTGCTGCA TCTGCTTTTG CAGCTGCAGC
236751 TTCTACAGCT GCAAGGTTGA CACCCTGAGT GCCTCCTAAA CCACCTGTGC
236801 CTCCTGATGC TGCCATATGC CTCCTATGAG CGACAACGTA TCAATTAGAA
236851 AATCTGAATT CTTCCTAAAG GCTGGATGCG GATTTCTGGT AGGATTTCTT
236901 GATAAGAAAT CACAGCAATG TCAGGGAATT CTGTTTCTAT TAATTTTCGT
236951 ACATATCTTC TTACATCAAT TGCTGTCAAT AATACTGGTG GTTGGCCTCC
237001 TGCAGGTGTT GGCGTGATCG TATTCCTCAT AGATTTTAAA ATTAGGTTCA
237051 CAGAATCAGG ATCTAGAGCA AGGTAAGAAC CTGCCGATGT CTGTTTAATT
237101 GCTCCACGAA TCATCTCTTC AATTTCTGGA TCTAAGAGAT AAACAGAAAT
237151 TGCTGATTGT CCTTGAGAGA ACTTGAAGCT GATATAAAGC TTTAAAGAAG
237201 ACCGTACATA TTCTGTAAGC AAAACTGTAT CTTTCTCAGT TTGCGCCCAC
237251 TCGCTCAGAG ATTCTAAGAT TGTACGTAGG TCTTTAATTG AGATTTGCTC
237301 TTGAACCAAT CTCTTAAAGA TTTCCGTAAG CTTTTGCAAT GGAATAAGCC
237351 TTGTGACTTC CTTCACTAAG TCCGGGAATG AACGTTCCAT AAATTCGATC
237401 ATAGAACGTA CCTCTTGAAT TCCCAAAAAC TCTTGAGAGC TTTTATGGAA
237451 AAAGTACGAA AGATGGAGAA TGATCACTTC GAGCGGCGTC CAATATTTAA
237501 TTGCTGCCTT CTCTAGAATA GCTTTTGCAT CTTCACTAAC CCAAGCTGAA



237551 GGAAGACCCG CAGCATTCTT ATAGGTAATG AAAGGTAGAT TATAACGGCT
237601 GAGATTGTCC TCCACCTCAT TGGTTAACAC ATGGTGCGGA GGAATTTTTC
237651 CTCGCACATA AGGGACTTCA TTAAGCAGAA TCATATAATC GTATCCTTCT
237701 AAAGAAGGGG AATCTGTGCG AACATGAATG CCAGGGTATC GGATTCCGAT
237751 ATCCTGATAG AGAGCTTGCC GCATTTTAGG AATCATATCA TCAACAAAGC
237801 TTTGTCCTGA TTTTGTCTTG TGTTGGATAA GCTTAGAGAG ATCTTTTCCA
237851 AGTTCTAGAA TTACGGGAAG AGTTAGAGAA TAGTCATCGG GATTATCCCC
237901 AACAGTAGCA GCGCCATCAC CAGCAGCCCC TACGGTTGTT GAAGCTCCTG
237951 AGCCACCACC TTTTTTTCCT GCCGCTGATT TCTTAGTCAG TAGGAGAATC
238001 CCTAAGGCAA CGAAAATTAA TGCTAAAATG GAGAAGGACC ATAGAGGGAA
238051 GCCCTTGAAG AAACCAACCC CTAAAGTTGC AGCACCTGCA AGGAGTAGTG
238101 CTCGTGGTTC TTTAACGAGC TGAGTAGAAA TCTCTTTACC CAAGTTCGTA
238151 TTTTTGTCAC TCGATACACG AGTCGTGACA ATACCCGCTG TCAACGCAAT
238201 CAAAAGAGAA GGAATTTGAG AGACTAAACC ATCTCCAATG GAGAGAAGAG
238251 TGTAGACGTG AGCTGCTTGA GCGAGGTCCA TGCCGTGCAT AGCCACCCCA
238301 ATCGTCAAAC CGCCAACAAT GTTAATCAAA GAGATAACGA TACCAGCGAT
238351 AACGTCTCCT TTGATGAACT TCATGGCACC GTCCATGGCT CCGTAGAGTT
238401 CACTTTCCTT TTGGATTTGA GCCCTTTTAT CACGAGCTTG TGTGGCATCA
238451 ATCATACCAG CTCGTAAGTC CGCATCAATC GCCATCTGTT TACCTGGCAT
238501 CGCATCCAAT CGGAATCGGG CAGCAACTTC GGCAACACGC TCGGCACCCT
238551 TAGTTACTAC GATAAACTGA ATGATTGTAA TAATGAGGAA GATAATGAAC
238601 CCGACCACAT AGTTCCCTCC AACCACGAAG TCTCCGAAGG CCTGAATGAC
238651 ATGACCCGCA TACGCTTTAA GGAGAATCTG TCGAGAAGAG GAAATATTAA
238701 TCCCCAAGCG GAACATCGTA GTGATGAGGA GCAACGAGGG AAAAACAGAC
238751 AGCTGCAAAG CACTTGGAAT ATAAAGAGCC ACCATCAATA AGAATACAGA
238801 GATCGATAAG TTGATGGTGA TCATCAAGTC AACGATAGGC GGAGGCAAAG
238851 GAATAATGAT CATTAAGACA ACGCCCATCA TCCAAAGAGC AAGGATTAAG
238901 TCGCTGGACT TATTGATCAT GTTTAAGGCG GTATCGCCAC CAAGTGTTCT
238951 GCTGACGAAA TTGAGTAGCT TATTCATTAT AAATGATCAG GTTGGTTAGT
239001 ATTTTTATTA TTAGGATTTT GCGCATTCAG TGAAGTGATA TAGAGTAGAA
239051 TTTCTCCAAT AGCTTCGTAA GTAGATTCTG GAATAAATTT TAATTCCTTC
239101 CCTTCATCCA AAAGCTGATG TGCTAAAGGT ACGTTTCGCA TAATGGGAAT
239151 TCCGTACTTT TCAGCTTCAT CAAGTATCCT TTTAGCTCGT AAGTTGATGC
239201 CCATGGCAAT GATCCAAGGT GCTTTATATT TTTCAGGCAT GTAGCCAATA
239251 GCAACAGCAA TATCTTTGGG ATTAGAGACT ACGGTGCTTG CATGTTTCAC
239301 CTGTGATGAC GAGTCTTCAT AGGCAATTTC TTGAGCAATT TGTCGACGAC
239351 GGCCTTTAAT CTCAGGATTT CCTTCCGTGT CTTTAAACTC CTGCTTAACC
239401 TCAAACTTCT CCATCTTTAA TTCTTTAGCG AAATTGTGGC GCTGATAGAC
239451 AAGGTCAAGA ATCGCAACAA TCAAAAAGAA AATTCCTATC GAGGTTACTG
239501 CTTTATAAAA AATTTCTTTG AAGATTTGAG CAGTAATTAT AGGAGAGACT
239551 CCTGCAGTTT CTATAATTAA AGAGACTTTG CTTTTTAACG TTATGTATAA
239601 AATTAAGGCT GCTCCAAAAA TTTTTAAAAT CGATTTGATC AGCTCTATGA
239651 GAGTCTTTAT TTTAAACTTT TGTTTGATGT TCTCAATAGG GTTGAACTTC
239701 TTGATATCTG GTTTAAAAAC TTCGGTAGAA AATGTAGGAC CAACGATAAG
239751 AAAACCTACA ATGACGCCAA CAACAGCAAC AGCTCCCAGT AAGGGAAGTG
239801 ATGCTGTTAA AATAAGCATA AGACAGTTCT TAAGATAAAA TAAGGTAATT
239851 ACAGGATCAT GGCGAGTGGG AGCTTGTGAG AGCATGGAAA CCAGAAAGCC
239901 ACCTAAATGC TTGAAAAAAA AGGTCGATAG GGAGAAAGCC GTAAACATAG
239951 AGACGATAAA GGTAACCGCA GAAGGAAAAT CCTGAGATTT TGCTACTTGA
240001 CCTTTTTTCC GAGCATCTCT AAGTCGCTTC GGCGTGGCCT TTTCTGTTTT
240051 TTCACCCATG CTGTTGCCAA GGTTCGATTA AGAGGACTAT ATTCGGAATT
240101 ATCTGCAAAT TACCTTATGC TCTTGTTTTT CTCAAGAGTG AAGCCCAAAC
240151 GTTTTGTTCT TAGTAAGATT ATAAATGAAT CTTAGAGATA AAGAAGAATA
240201 GTAGAAAAAA AGAACAATAA CGAAAAGTTG TTGAATTTAG CAATACTTTA
240251 GGTCACTTCA AGATGGTTCT ATATGAAGAA AGTATTAGGT AGGGAAAACT

240301 TCAATCGATT ATAGTGGTTT TTTATCATTT TCAAAGCACT CTTGATTTTG

240351 AAGGTACTAT TTTTCTAGCT ATTGTGCTTT TTTGAAACAG AAGAGTTGTC

240401 TTTCTAATAT TCGAAAAAGC ATGTCATTAT TTTTATTTTT AGATGTCTTA 

240451 TGAGTCATAC TGAATGTGGA ATTGTAGGGC TTCCTAATGT AGGAAAGTCT 

240501 GGCTTATTCA ATGCTTTAAC AGGAGCTCAA GTTGCCTCCT GTAACTATCC

240551 GTTTTGTACT ATCGATCCTA ATGTGGGTAT TGTTCCTGTT ATCGATGAAA 

240601 GACTGGAAGC CTTAGCTAAA ATTAGCAATA GTCAGAAGAT CATCTATGCG 

240651 GATATGAAAT TTGTAGATAT TGCAGGTTTA GTTAAGGGAG CTTCCGATGG

240701 CGCGGGTCTG GGAAATCGGT TTCTCTCTCA TATTCGAGAA ACTCATGCTA

240751 TTGCTCATGT AGTGCGTTGT TTTGATGATC CAGACGTTAC ACACGTTTCA

240801 GGAAAAGTCA ACCCTGTTGA GGATATTGAA GTTATCAACT TAGAGCTCAT 

240851 TTTTTCTGAC TTCTCCTCAG CAAAAAATAT CCATAGCAAA TTAGAAAAGC 

240901 TAGCCAAAGG AAAGCGTGAA GTAGGAGCTC TCTTGCCTCT ATTTGATACA 

240951 ATTATTGCTC ACTTAGAAAA GGGGCTGCCG CTACGTACTT TAGAATTAAC 

241001 TCCAGAACAA ATTGTGGCAT TAAAGCCCTA TCCGTTTTTG ACCATGAAGC

241051 CTATGTTTTA TATAGCTAAT GTTGACGAGA GTTCTCTACC AGATATGGAT 

241101 AATGATTATG TTGCCGCTGT TCGGGAAGTT GCTGCAAAAG AAAATTCTAA 

241151 AGTGGTTCCT ATCTGTGTTC GTATAGAAGA AGAAATCGTT TCCTTACCTA 

241201 TTGAAGAGCG CTTAGAATTT CTTATGAGCT TAGGTCTTGA AAAATCAGGA
241251 CTTCATAGAT TAGTGCGTGC TGCGTATGAC ACTTTAGGAC TGATTTCTTA 

241301 TTTTACTACA GGTCCTCAAG AATCTCGTGC ATGGACAGTG GTTCGAGGGT

241351 CTTCTGCTTG GGAAGCTGCT GGAGAAATCC ATACGGATAT TCAAAAGGGC
241401 TTTATTCGTG CTGAAGTGAT TACTTTTGAA GATATGATAG AGTGTCAAGG 

241451 TCGTGCAGCT GCTCGAGAAT TAGGGAAATT ACATATAGAA GGACGTGATT
241501 ATATCGTCCA GGACGGTGAT ACTATGCTGT TCCTTCATAA TTAAAGGAAC
241551 CCTTTGCAAA CCAATCTTGA GCATCCAAAA TGTCTTTTTC AATTGCTCGT 
241601 ATTAGAGTTT CTTTTGATTG AAACTTTTTT TCTTCTCTAA GAAATTTTCT
241651 CGGGATAATG CTCACTTCTT TGCCGTATAG ATTTTCCGCA AAGGAAAAGA
241701 TATGCGCCTC TGCATATAAA GACTCTCTTC CAAAAGTAGG GGCAGTTCCT
241751 AAATTCATAA CACCCTGACA GGTAGTGCTA TCATAACGTA TTTCACAAGC
241801 ATAAACTCCT AGGGGAATTA AACTTTCTTC TCTAGGAAGA TTTATAGTGG
241851 CGAATCCTAG AGAACCTCCT ATTCCGGAGC CCTCGGTTAT TTTTCCAGAA
241901 ATGGCATAGG GATGACCCAA AAAACGATGA GCACATTCAA GATTCCCTGC
241951 GGACAGAAAC TGGCGGATTG CTTTGCTGGA GACAACTATG TTATCCATAC
242001 GGTAAGGAGG AATCTTGATG ACCTCTATAC CTAACGGCTT GCCTATAGTA
242051 TCGAGAGCCT CGGTATTGCT TTGCTGTTCT TTCCCTATGC AAGAATCATA
242101 ACCTAAGATG AGGCGTTTGC ATTTCAAGTT ACGATGTAAC AAAGTAAGAA
242151 ATTCTTCTGC CGATTGATTC GCAAAATTTA AATCAAAAGT AAGGACACCT
242201 AACCAGTCTA TGGGAAACGT TTGCAATAAT TGGAGGCGCT CTTCTTTTGT
242251 ATTGATGAGT TTCGTGTGAT TTAAAGAAAG TACCGTTTGA GGATGAGAAT
242301 CAAAGGTAAT AACTCCACTG GATCCAGAAT AGGAAGTAAG AATAGATAAA
242351 AGATTGCTAT GCCCTAGATG ACATCCGTCG AAAAAACCTA CAGTTACAGA
242401 ATCTACAGAA AACGAAGACG TTAAACTATA GGCTATTTCC ATGGGCATCT
242451 CGTAGGTAGG GAGAAATATC GAAATCGGGG TGGTCTAATA GATTCCCATC
242501 AATACATTCA TCTATAGAAA AACGGCCACT GCGTAAACGG CGTAGCTGCT
242551 CAAGATAAGC TCCACAGCCT AACATCGTGC CAAGCTCATG AGCAATGCTG
242601 CGAATATAAG TTCCTTTGCT ACAAGAGACT ACAAAATGCA ATAAAGGGTA
242651 CTCATATTTC GTAATCTGCA AGTGAACTTG AACTGTAGAA TGGTGACGTT
242701 CTATAGATAA ACCTTTTCTA GCATATTCAT ACAGCTTTTT CCCTTGGACT
242751 TTTTTAGCGG AAAACATGGG AGGCAGTTGC TGGATCTCTC CTTGGAAATA
242801 CTCGGCAGCT GATAATACTT CTTCGAGACT AGGAATCTTC TTAGATCTTC
242851 CTACAACTTT GCCGTCGCAA TCATAAGAAT CGGTAGTTGT CCCTAAATGG
242901 GCAATTGCTT CGTATTCCTT GTCTTCAAAA AGTAAAATAT CAGAAAGTCT



242951 AGTAAATTTA CGGCCAATCA ACATGACCAT AACGCCAGTA GCGAAGGGAT
243001 CTAAAGTTCC TGCATGACCA ATCTTTTTAA CGCCTATTAA CTTGGTTAGA
243051 GCGCGGATAA GGCTAAACGA AGTTCTCCCT TGAGGCTTGT CTACAAGAAG
243101 AATGCCCTCT TTTAATTCTA CTGCAAGATC CATAGTCATG TCTTTAATAG
243151 TATTCAAGTT CCCAAAAAAT ATAGTTTATT AACTCTTTTC TTTCTCTTGA
243201 ATCTGCCAAA GCAGGTTTTC TATATAATCT TGAGGTGAGA AAATATCATC
243251 GAGATAAAAA TGAAGTTCTG GGAAATATTT AAGGACGACA TTTTTCGAAG
243301 CTCTATGAGC GATAAAACCA GCAGAGACTT TTAAAGCTTC TAAAGCCTCT
243351 TCCTTGGTAT TCTCATGAGG CATTACAGAT ACATAAACAC GTGCAGAGTG
243401 CAAATCCTTA GATAGAGAAA CACGAGTTAC CGTGATCCAA AGATTAGAAA
243451 TCTTGGGATG CTTAACATCT TTTAAAATTA CCTTTGCAAT GGCTTCTTGT
243501 AATAAAGCAT TTACCCGTTT AATACGTCTA TTTTCTGTCA TACAGTACTT
243551 CAAGTTATAG TTTTTGTGGA TGATAGATAA CTTCATAACA TTGTAGGACA
243601 TCACCTATTT GAGCTTGCTG GTATCCTTCT AACAAAATTC CACACTCTAA
243651 ACCTTTGCGA ACTTCTTTGA CATCTTCTTT AACACGTTTT AATGAAGATA
243701 ACGTACCTTT CCAAAGGATC TCTTTATTAC GTAATACTCG GACTTTATGA
243751 TTGCGAGTCA TAATTCCTTC AGTAACTATG CAACCGTAAA TAGATCCTAC
243801 TTGTGAAGAC CTAAAGATTT CTTTAATCTC AGCAGAACCT TCATCTTTTT
243851 CTTCAGCAAT AGGATCTAAT AGAGAAGTCA TAATTTCTTT AATTGCATCA
243901 ATAGCATGAT AGATGACGGT AAATAGTTCA ACTCGGACTC CTAAGCTCTT
243951 AATTAAAGGT TCCGCATGAC TTTCTATTCC TGTATGGAAA CCGATGAGAA
244001 CTGCTTTAGA GGCGGCAGCT AAACGAATGT CTGATTCTGA AATTTCTCCT
244051 ACACTGTTTG TTAAAATTTC AACATCTACT TTTTCTGATT TAATCTTAGA
244101 TATTGAACTG ACCAAAGCTT CTATGGAACC TTGAACATCA GCTTTAATCA
244151 TAAGCTTAAG AGTCTTTTTA TTCTGTAACA TAGAATCAAA GTTAGGCCGC
244201 TTCTTTTGCT GTAAAGCAAA ACGCTGTTGT CCTGCGGATC TAGCTTCAAT
244251 AATGTCTCTA GCCGTTTTCT CGTTTTTCAC GACGAAGAAA GGATCGCCAG
244301 CTTTAGGAAT GTCCGATAGA CCTGTGATCA ACACAGGAAT AGATGGCCCA
244351 GCTTCTTTCA TCAATTCATT ATGTTCGTTA TGCATAGTTT TCACTTTGCC
244401 ATAACAATCA TTGAAGACGA GAGCTTCGCC CAGTTTTAAG CTTCCATTTT
244451 GAATCAAAAC AGTCGCAACA GGTCCGAGAC CCTTGTGCAG TTCTGATTCA
244501 ATAACAAGTC CTCGAGCACG TGCTGAAGGA TCGGCTTTTA GCTCCAAGAC
244551 TTCAGCTTGT AAAGCTAACA TCTCTAAAAG TTCTGAAAGA CCTTCTCCTG
244601 TTTTTGCGGA GGTATTTACT GTAACAGTCG AGCCTCCCCA AGCTTCTGGC
244651 AATAGATTGA TTTCAGAAAG TTGTCTATAG ATGGTTTCGG AATTAAAATT
244701 AGGCTTATCA CACTTGTTGA TAGCTACAAC AATAGCGATA TCAGCAGCTT
244751 TTGCATGTTC AATAGCCTCT AAAGTTTGTT CTTTAATTCC TTCGTCTCCA
244801 GCGACTACAA GCACAACAAT ATCACAAACT TCAGCTCCAC GGGCTCGCAT
244851 TGCAGAGAAA GCTTCGTGAC CAGGAGTATC TAAAATTGTT ATGTCTCCCA
244901 CTGGGGTGGA GCAGCAGAAG GCTCCCATGT GTTGGGTAAT CGCTCCAGCT
244951 TCTGTTGCAG CGACATTACT TTTCCTTAAG GAGTCAATGA GTGTTGTTTT
245001 TCCGTGGTCG ACGTGACCCA TAAACGCAAC AATAGGGGAG CGAATCACAA
245051 GCTTGCTGGG ATCTGTAGAT TGAATTTCGT CTCTTACAGT GTCATTGCTT
245101 AGGCACAACT TATCTTGCTC AGAATAGTCG ATGTCAATTG TACATCCAAA
245151 CTCTAAGCCA ATAAATTGTA CTGCAGTTTC GCTGTCTAGA ATATCATTGA
245201 CTACATAGGT CATTCCATGA ATGAATAACT TTTGAATGAC TTCTGAAGCC
245251 TTGAGCTTCA TTTCTGCTGC CAGATCTTTG ACGGTAATTG GCAAGGAAAT
245301 TTTGATATGC GTAGGTCGCT GGATAGAGGC TTCGTCATAG TGTTTTTTAG
245351 GCTTATAAAC ACGTTTTTTT CGCCATCTGT CTTCTTCTCC GCCTTCATTT
245401 AATCCGTAAC GATCTCTTCC TGTAAAAGCC TTTAGGCTTT CATCAGATTT
245451 CTTAGAACGA TCACGAAAGT CGGTAAGATT TTTCTTCCCA GCATCTCTTT
245501 TTGGACCAGA AGCAGGACTA CGGTTAGCAG GATTGAATTG TTTTTCTCGA
245551 TTATTCTGTT CACCACCTTC AGATGTTCCA GGTTTCCCTG TTTTATCTGA
245601 AGCAACGGGC TTTGTGCTTT TCGAGCCAGC TACGACTTTC TCTTCCTTGG
245651 CAGGAGCCTT GAATGTTTTT GCTAGGAGAT GATTGATATG CTTTCCTGTA
245701 GGGCCGAACT TAGATTTAAT CATTACGACG CTTTTAGGTT CAGCAGGCTT
245751 CACAGGCTTA GGCTCTAATT CTTTTTCTTG AGGTGGGGTT TCGGGCAATA
245801 CGGGTTGCTC AGGAAGAACT TCAGCAACTG GATGAACCTC AGGACTTTCG
245851 TCACAAACCT CATCGACTAC TTCTAACTCA GGCTCAGGAT CTGCTATGGA
245901 GACTGGAGCA GGTTCAGATG TATCCACTGG AATATGAGCA GAAGACTCTT
245951 CTTCGGATGA TGAGAACGAC GAACGATTTT TAGCACGAAT GCGACGTGAA
246001 GTAGACTCTG GTGAAGCTTG TTCCGCACTT GCCGTAGGGG TAGAAGTTGC
246051 GGCAAGAGCT ACTTTTACAG ACTTTTCTTT CGCAGAAGGT TTTTCTGAAG
246101 AAGATTTAGC TTCAGAAGAT CCTGCTTGGG CAAGTTTTTG TTTTAGCTTA
246151 TCCAGCCCCG CGGCTTTCGT TAATTGAGCA TTCTTAATCT TCAATTTCAG
246201 GTTTTTCGTC AACTTTACTT TCTCCATATT TGCTGACTTG CTCAAGGATC
246251 TTATAAGCAA GCTCTAAACT GATCCCAGGA ACAGATGCCA GATCATTAGC
246301 ACTCGCTAAT AATACTCTTC TAATTGTGTC ATATCCAGCA TGTTCTAAAT
246351 TTTGGATGAC TAGCTTACTA ATCCCTTCCA TTTCTAAGGG TTGATCTAAG
246401 TGCGGACTAT CGAATTCTGC TAATTGAAGG CGTTGAATTT CTAGCAACTT
246451 ATTGTACTCA CTCATACGTT GTACTTCGAG CTCGTAGTCT AGAATGTGGC
246501 TAATTAAACG AGCGTTAATT CCTCGTTTAC CAATAACAGT AGCGTAGTCT
246551 GCATCATTAA CGACAATTGC AATCACTTTG TCGTCTTCTA AAATAGCAAT
246601 CTTTTGGATT TCTATTGGAT AAAGAAGATT CTGTAATAAC TCTGTAGAGA
246651 CGGGGGAGTA ATTGACAATG TCAATTTTCT CATCGTTCAA TTCTCGAATG
246701 ATATTTTTTA CTCGAGAACC TCGCATACCT ACAAAAGCTC CAACAGGATC
246751 AGTTTTAGGG TCTGACGATC TTACAGCTAG TTTCGTGCGG TACCCAGCTT
246801 CACGAGCTAT CTTAACAATC TCCACAGAAC CTTCTTCTAG TTCTGGGACT
246851 TCTTGAATAA ATAATTGTTT AACAAATTCT GCGTGACTAC GACTGAGGAT
246901 AACTTCCGCT CCACCATTTT CAGACTCTTG AACTTCATAG AGTAGGGCGT
246951 AAATTTTATC ACCGATCTTA TGTTTTTCTG TTTTAGGATA AAACCGGGTA



247001 GGAAGAATTG CTTCAACTTT TCCTAAGTCA ATAATTAAAT TAGAACCTTT
247051 AGCAAAACGT TTGACAACAC CAGATAAAGT TTCATTTACG CGATGGCGAT
247101 ATTCTTCATA AATAACGTCT CTCTCAGCAT GTCTTAGCTT TTGACCGATA
247151 ATTTGTCGTG CTGCGTGAGC AGCTATCTTC CAAAATTATC AGAAACAAAA
247201 GGGACATCCA TGTACTGACC AATCTGACAG TCCGGATCGT ATTCTCTGGC
247251 TTTATCTAAA GGAATTTCTT TGCTAGGATT CTGACAAATT TCTACTATTT
247301 CCTTTTCACA AAAGACTTCG ATGTCACCAG TACGAGAATT AATGTTTACA
247351 GATATGTTCG CGTCATCTCT TAAGGTTTTT TTAGCAGCAA TTTTTAAAGC
247401 AGATTCGATA GCTCCTATAA TAGTAGAGCG CTGAATCCCT TTTTCTTTCT
247451 CCATGTAGTC AAAAATAGCT ACAAGATTTT TATTCATTAT ACTCCTCTTG
247501 TAAATAAGGA AATCAGTAAA TAGCTAAATT AAGGACGTAT TATTTAGAAC
247551 TAAATAATAC GATCCTCTGC ATTACCAGCA TTAGATGCTA TTTTCCTTTT
247601 TTCTTTCTCT CTTTAGGGCC TTGAGAATCC TTGAAATCTA ATTCAGTCCT
247651 AGAGTCTTGA TCATAAGCAT TGTCAGCTAA GTATTCTTTT ACAGAAAGAG
247701 AAACTTTTTT ATGATCTGGA TCTAGCTTAA TTACTTTTGC AGAAACATTT
247751 TCTCCAATGG AGATAATATC TTCAATTTTT GCAAAGGGCT TGTCAGAAAG
247801 TTCTGAAACG TGAATCAATC CTTCAATCCC GTTTTGTAGC TCAACAAAGG
247851 CTCCAAATGC AGTGATTTTA GTCACAACTC CTGAAATTAC TGTGCCAGCA
247901 GGGAACATAG CTTCAATTTC ATTCCAAGGA TTAGAACTTA ATTGCTTAAC
247951 TCCTAAAGTA ATTTTTTTAC TTTCTTTGTC TACTGATAAA ATAACAGCCT
248001 CTACAGAATT TCCTTTTTTG AATAGTTCTG AAGGGTGAGA GACTTTTTTA
248051 ATCCAACTCA TGTCAGAAAT ATGAATCAGA CCCTCAATTC CTGGTTCTAA
248101 TTCAACGAAA GCACCGTAAT TGGTTAAGTT CTTGATTTCA GCATTGACAT
248151 GGAGACCTAT AGGATATTTT TCTTCGATAT TGTCCCAAGG ATTACGTTCT
248201 GTTTGCTTTA ATCCTAGAGA AATTTTTCCT TCGTCCTTCT GAATAGATAG
248251 AACAATGGCT TCAACTTCAT CGCCTTTATT TACGACTTCA CTAGGATCTA
248301 CAATATTTTT CACCCAAGAC ATTTCAGAAA TGTGAATTAG ACCTTCAATG
248351 CCCTCTTCAA TTTCAATGAA AGCTCCGTAG GGGAGAAGCT TCACAATTTT

248401 ACCAAGAACT CGTTTTCCAG GAGGGTATTT CTTCTCAATA TCTTCCCAAG

248451 GATTATGCTC TTTTTGTTTG AGACCTAGAG CAACTCGTCC TTTTTCTTTA

248501 TCTACGCTTA AAATAATTAC TTCCAACTCT TGATTCAATT CGACCATTTC 

248551 GGAAGGATGT CGTATGCGCT TCCAGGTCAT ATCGGTAATG TGGAGAAGAC 

248601 CGTCAATACC ATCGAGATCT AAGAATACAC CAAAGTCAGT AATGTTTTTA 

248651 ACAACTCCTT TGCGGTATTC TCCGATAGAA ATTTGTTCAA TAAGTTCGGC 

248701 TTTCTTAGAG ATTCTCTCAG CTTCTAAGAG TTCTCTTCTT GAGACAACAA 

248751 TATTGCGACG TTCAACGTTA ATTTTTAAAA TTTTGAATTC ACAAACTTTT 

248801 CCGACATAAT CATCTAAATT TTTGATTTTC TTGTTGTCAA TTTGTGATCC 

248851 AGGTAGGAAG GCTTCCATTC CAATATCTAC AATAAGGCCG CCTTTGACTT 

248901 TACGTGTAAT TTGACCTTTA ACAATAGAAC CTTCTTCACA ATGAGCTAAG 

248951 ATGTATTCCC ATTGACGTTG TCGTGTGGCT TTTTCTCTAG AAAGGACAAC 

249001 TTTGCCCTCT TCGTCTTCGG CTTGGTCGAG ATAGACTTCT ACTTCAGCTC 

249051 CAAGCACTAA ACCTTCTGAA GAGTCTATGA ACTCTGACAT AGGGATCACT 

249101 CCCTCAGACT TCAGACCAAC ATCAACTACG ACAAAGTCTT TATTAATATC 

249151 AACTACGGTA CCTTTTAGGA TGGCGCCAGG CTGTATTTCG TTATCAGATT 

249201 CTTCTTCGCT CGAAGTAATT CTGTGTGCCG TATAAAGCAA ATCTTTAAAT 

249251 TCGGCAACGT CTTCTGTGAG GCATTCTATA TTGTCCAGAA TTTTTTTAGA

249301 TCCCCAAGTA TATTCAGCTT GTTTTGGCAT TTATTGTTAT TCTCCTAAAA

249351 ACTACAAAGT CAAAAAAGAC AGTGTAAATA TAAACCTTAA AAAAGGCAAG
249401 ACCTCCCTTG CAGGAGACGA TTAGTTCTAT ATAATCGAAT TAATCGTAAA

249451 ATCCTATACT TCCGTTCTAG AGATGTATAA GGCCTCTTTT TTAACTTTTC
249501 TATATTACCA GATTTTATAA AATACGTGTG GTAATTATCA TAAATGTGCC 
249551 TTGTCATATG TAGAGAAACA GGATTTGGCG CATAGAAAAC ATATCGTTTT 
249601 TTTCATAAGC ACCCCATAAA TTCTCGTCCC AGATGGGCTA CATCAGCTTT 
249651 GTTCTTAAGC CTGATAGATC TTTTTGGTTG CTGCTCAAGA ACCAAGTGCA
249701 TTCAGAAGTA GAACCATACT TATTGAGGGT TAAATAATAG GATGGGTCTT
249751 GGTTTCTTAG CTCTAAGAAA ATTCTAAAGA TTAATCACAG AAATGTAGGT
249801 GTTATTACTG GAAGATGAAG ATCTTGAGAT TTGAAACTTT AGTCTTAAAT
249851 CCGTATAAAT TAAAACTAGA CAAAAGCTAT TCAAAAGCAA GTTTTTTGCA
249901 TAGAGAAACT GACGCATAGA CGCTAAGTGT TGTTTATTAG ATAGGCTAGA
249951 ATTAAGTGTC TTTTTTCTAG TAGATCAAGA ACAGTGTTCT CAATTTCTAA
250001 AGTTAGATAA GGAAATTATA AATGATTCAT TCCCGGTTAA TTATTATTGG
250051 TTCAGGTCCA TCTGGATATA CAGCGGCAAT TTATGCATCA AGAGCGCTTT
250101 TGCATCCTCT TTTATTTGAG GGGTTTTTCT CTGGGATCTC TGGTGGCCAG
250151 CTTATGACTA CAACAGAAGT TGAGAATTTT CCAGGGTTTC CTGAAGGGAT
250201 TCTTGGGCCA AAACTTATGA ATAATATGAA GGAGCAGGCT GTGCGGTTTG
250251 GGACCAAGAC ACTAGCTCAA GATATTATTT CCGTAGATTT TTCTGTTCGC
250301 CCTTTTATTT TGAAATCAAA AGAAGAAACC TATTCTTGTG ATGCCTGTAT
250351 CATAGCTACA GGAGCTTCTG CTAAACGTTT AGAAATTCCT GGAGCAGGAA
250401 ACGATGAATT TTGGCAAAAA GGAGTGACTG CTTGTGCCGT TTGCGATGGG
250451 GCTTCTCCTA TTTTTAAAAA TAAAGATCTT TATGTGATTG GGGGAGGGGA
250501 TTCTGCTTTA GAAGAAGCTC TTTACCTGAC TCGTTATGGA AGCCACGTAT
250551 ATGTAGTTCA TCGTAGAGAT AAACTGCGGG CTTCTAAAGC TATGGAAGCT
250601 CGGGCGCAAA ACAATGAAAA AATTACATTT TTATGGAATA GCGAGATTGT
250651 AAAAATTTCT GGAGATAGCA TTGTCCGTTC CGTAGATATT AAGAATGTTC
250701 AGACTCAAGA AATTACAACT AGAGAAGCTG CGGGGGTGTT CTTTGCTATA
250751 GGCCATAAGC CAAATACGGA TTTTCTCGGA GGACAGCTGA CGTTAGATGA
250801 GTCGGGCTAT ATTGTGACTG AGAAAGGAAC GTCCAAGACT TCTGTCCCTG
250851 GAGTATTTGC TGCTGGAGAT GTTCAGGATA AGTACTATCG TCAGGCGGTT
250901 ACTTCTGCAG GAAGTGGTTG TATAGCAGCA CTAGATGCTG AAAGATTCTT
250951 AGGCTAATGC GATTGCAGTT GCTGTGGCAT ACTCTTTGCA GTGGCTTATA
251001 GAGAGAATGA CTTTAGAAAT TCCAATTTTT GCATAGACAT GCGAAGGGAG
251051 GAGAACTTCG GGTCCGTGAG ATACTTTAAA GACTTCGATG TCTTTCCAGG
251101 CAACAACGCT CCCTATGCCA GTTCCTAAAG CTTTTGCTAC AGCTTCTTTT
251151 CCAGCAAAGC GACCTGCAAA TGAAGGGATG GGATCGGTCT TTTCTAAGCA
251201 ATATTTCTGT TCTGCTTCTG TAAAGATTCT ATTGAGTAGT CGATTGCCGT
251251 GAGTTGCAAT TGCCTCGCGA ATGCGGCTAA TTTCAATAAT ATCGGTTCCT
251301 ATATGAATGA TTTCCATAGA GTTCGCACTA ATTTCCTTCA GGATTTTCCA
251351 TTAGGGCATT CTGTAATTTT GAGAAGTAAG CGATGGCAGT TTTAATGAGA
251401 GCATAGCCCA CTGTAGCTCG AATCACGAAT AAGGCTTGAG GATTGCTGAT
251451 ATAACGACGA AGGACTAAAG AGAAGGCAAT CATAATGACC ATGATGAGAA
251501 ACCTTCTAGA GATAAAGGTC TGCTTTATAT AGGCAATCCA TGTAATTTTT
251551 TGTGAAAAGA ATTCAGAAGA TAGACTCAGC TGTTTCCTTA TAGTTTTGCT
251601 TAGCAAGTAG CGATGTTTTA GGGATGCTAA TCCCCAAGAA ACCATAGATA
251651 GAAGACAGTA GGTAAAAAAA GAAGAAAGGG GGTCTAGGGC TATTGCGGTA
251701 GCTTTTAGCA AGAGTTTCAT TCCTGCTGAT ATCCACAAAA TACCAGGAAA
251751 TAGTATCAGG AAATATTTGA TGTTTCTTGC CATAGTGCAT TACACTTAAT
251801 TTCTAAATAC TCTATTTTAC TTGGAGTATG AATTTTTTGC TAACTGGTAT
251851 TAAGATATGC GAGCTGAGAT GGCTGTGATC TATTGGGATC GCTCAAAAAT
251901 TGTCTGGTCT TTCGAGCCAT GGTCTCTAAG ACTTACTTGG TATGGCGTCT
251951 TTTTTACTGT AGGGATTTTT CTAGCATGTC TCTCAGCAAG GTATTTGGCT
252001 CTTTCCTATT ATGGTTTGAA AGATCATTTA AGTTTTTCCA AAAGCCAGCT
252051 ACGCGTGGCT TTAGAAAACT TTTTTATATA CTCTATTTTA TTTATTGTCC
252101 CTGGAGCTAG ACTTGCCTAT GTGATTTTTT ATGGATGGAG TTTTTACTTA
252151 CAACATCCTG AAGAGATCAT TCAAATATGG CACGGAGGCT TGTCGAGTCA
252201 TGGAGGCGTT CTTGGCTTTC TTTTGTGGGC GGCCATTTTT TCTTGGATAT
252251 ATAAAAAAAA GATTTCAAAA TTGACTTTTC TCTTCCTTAC AGACTTGTGT
252301 GGATCAGTTT TCGGAATTGC AGCGTTTTTT ATTCGTTTGG GTAATTTTTG
252351 GAATCAAGAA ATTGTAGGAA CACCGACTTC TTTGCCTTGG GGGGTGGTTT



252401 TTTCTGATCC TATGCAAGGT GTCCAGGGAG TTCCTGTGCA TCCTGTGCAG
252451 CTTTATGAGG GAATCAGTTA CTTGGTCGTC TCTGGAATTT TATATTTTCT
252501 TTCCTATAAG CGTTATTTGC ATTTAGGTAA GGGATATGTG ACTTCTATAG
252551 CCTGTATTTC TGTCGCTTTC ATTCGTTTTT TTGCGGAGTA TGTAAAAAGC
252601 CATCAAGGGA AAGTTTTAGC AGAGGATTGT CTACTTACAA TTGGTCAAAT
252651 TTTATCTATA CCTTTATTTC TATTTGGTGT GGCCTTACTT ATCATTTGCT
252701 CATTGAAAGC TCGAAGGCAC CGTTCACACA TATAGCAGCA TCTTTTTGGC
252751 ATGAGACTTT AGAATTTTAC AGTCTATCAC AATAAGAGAC CTGTATTGAG
252801 AGATTCTCCT TTGGGAGATA ATGTGGGTTT ATGATATTTT GACCTCTTAA
252851 GTTTAATTTT TAGAGTTTAT CAAATGAATA AACGCACTTT GCTTTTTGTT
252901 TCTTTAATTG GGATTGCTTT TGTAGGATGT CAAATATTTT TTGGTTATAA
252951 TGAATTTCGT TCCTGCAAAA ATCTAGCAGA GAAACAAAGA AAGATTTCAG
253001 AACAGACGCT AGCTGCAGTA GAATCTGTAG GGTTAAGTGT AGCTTCATGG
253051 GACACCGATG TAAACGGAGA AGAACATAAG AATAACTACG CAGTTCGTGT
253101 TGGAGACAAG TTATTTTTAT TACATAATGG AGAAGCTGCT CAGTCTGTTT
253151 ATTCTTCTGG GGAATCTTGG AGCTTTGTAG ATCACAAGTG TGGTTTCGAT
253201 AATATTCACT TGGCTTTATA TCGTCAGCAG GGTTCTTCTT TCAATCCTAC
253251 GAATACGGGA AAAGTTTTTC TTCCTACGAA TCATGAAGGT TTACCTGTAC
253301 TAGTTGTTGA GTTTCGTAAC AATAAAGAGC CTTTAGTATT TCTAGGTGAG
253351 TACGCACAAG GAAGAATTTC CAATAAAGAT AGCACGATCT TTGGTACAGC
253401 GCTTGTCTTT TGGAGATCAG GAAGCGACTA TATTCCTTTA GGTCTCTATG
253451 ATTCTCGAGA AGAAAAGTTA GTTTCTTTGG ATCTTCCTAT TACACGAGCT
253501 GTAATTTTTG GTAATGACCA AGATTCGGCA AAGTCGTCAG ATACTGCGAA
253551 CCACTATGTT TTATTTAATG ATTACATGCA GATTATTGTT TCTGAAGAGA
253601 GTGGTTCTAT AGAAGGTATC AATTTACCTT TTGCTTCAAC AAATAATAAA
253651 AGCATTGTGA ATGAAATTGG TTTTGATAGG GATTTAGCTT CAGAGAAATC
253701 TCCTGAAGCT CTTTTCCCTG GGCTGTCTTC AAAACTTCCT GATGGCCAAC



253751 AAGCCAAAAA CTCGATTGGA GGTTACTACC CTTTATTGCG CAGGGGATTA
253801 TTAAGTGATT CTAAGAAATT ACTTCCTCTA GAGTATCACG CATTAAATGT
253851 GGTTTCAGGA AGAGAGCTAG CGACTCCTGT GGCTTTAAGA TACCGAGTTC
253901 TTTCCTATAC CCCCCATTCC ATTCAATTGG AAAGCTTAGA TAGATCGGTT
253951 CAGAAGGTAT ACAAACTTCC AGAGAATCCG GAAGAAAAGC CCTATGTTTT
254001 TGAAACTGCA ATTACTTTAA CGAAAGAAAC CGAAGATGTA TGGGTAACTT
254051 CAGGAGTTCC TGAAGTGGAG ATCATGTCAA ATGCTTCAGC CCCAACCATT
254101 AAATACAGGG TTATCAAAAA AAATAAGGGG TCTTTAGATA AAGTTAAGCT
254151 TCCAAAAGTA AAAGAGCCTT TAGCTATACG TCGTGGTGTT TATCCTCAAT
254201 GGATTTTAAA TTCGAATGGA TATTTCGGTA TTATTTTAAC TCCGTTGTCT
254251 GAAATTGCTT CTGGCTATGG ATCTCTCTAC ATTTCGGGTT CTACGGCTCC
254301 GACAAGATTG TCTGCTATTT CTCCTAAAAA TCAACTGTAT CCAGTATCAA
254351 AATATCCTGG ATATGAGACC TTGCTTCCTT TGCCAAAAGA TGCAGGGACA
254401 CATCGATTTT TAGTGTATGC AGGTCCCTTG GCAGAGCCTA CACTTAAAGT
254451 ATTAGATAAG ACAATTACTC AGGAGAAGGG AGAAAATCCT GAGTATCTTG
254501 ATAGCATTTC TTTCCGTGGT GTTTTTGCAT TTATTACAGC TCCTTTTGCA
254551 GCACTCCTAT TTATTATTAT GAAGTTCTTC AAATTGGTTA CGGGTTCTTG
254601 GGGAATTTCC ATTATTTTAC TTACTGTATT TTTGAAATTG CTTCTCTATC
254651 CTTTAAATGC ATGGTCCATA CGATCTATGA GGCGTATGCA GATTTTATCT
254701 CCTTATATTC AGCAAATTCA GCAAAAGTAT AAGAACGAAC CTAAGCGTGC
254751 TCAGATGGAA ATCATGGGCT TGTATAAGAC AAACAAAGTG AATCCTATCA
254801 CGGGTTGTTT ACCTTTATTG ATACAGCTTC CTTTCCTAAT TGCGATGTTT
254851 GATTTATTAA AGTCATCATT CTTATTACGA GGAGCCTCGT TTATTCCTGG
254901 GTGGATTGAT AACTTAACAG CTCCTGATGT GTTGTTTTCT TGGCAGACAT
254951 CGATATGGTT TATTGGAAAT GAGTTCCACT TACTTCCTAT TCTATTAGGT
255001 ATAGTGATGT TCTTACAACA GAAGGTCACG AGTTTGCATA AGAAAGGACC
255051 TGTTACGGAT CAGCAGAAAC AGCAACAAGT TATGGGGAAC ATGATGGCGA
255101 TTTTATTTAC CGCTATGTTC TATAACTTCC CTTCAGGATT AAACATCTAT
255151 TGGCTTTCGT CTATGATTTT AGGAGTCGTC CAGCAGTGGA TCACTAATAA
255201 GATATTAGAT AGCAAACATC TTAAAAATGA AGTGGTTTTA AATAATAAAA
255251 AACATCGATA ATACTGAAAA AAAAGTCGAA TTGTCTTAAT ACAGGCTTAT
255301 TGAAACAAGC GATTTATATG AGCTCATAGT TTATGCGAGC ATGGGAAGAA
255351 TTTCTTTTGC TACAAGAGAA AGAAATTGGC ACAAATACTG TAGACAAGTG
255401 GTTGCGATCT TTAAAGGTCT TATGTTTTGA TGCTTGTAAT TTGTATCTTG
255451 AAGCTCAAGA TTCTTTTCAA ATTACTTGGT TTGAGGAGCA TATAAGACAT
255501 AAGGTTAAAT CTGGTCTTGT AAATAATAAC AATAAGCCCA TTCGTGTTCA
255551 CGTTACTTCG GTAGATAAAG CAGCTCCTTT TTATAAGGAG AAGCAGATGC
255601 AGCAAGAGAA GACAGCATAC TTTACCATGC ATTATGGAAG TGTGAATCCT
255651 GAGATGACCT TCTCTAATTT TTTAGTTACC CCTGAAAATG ATCTTCCTTT
255701 TCGTGTTTTA CAGGAATTTA CTAAGAGTCC TGATGAAAAC GGAGGAGTTA
255751 CTTTTAATCC AATTTATCTG TTTGGACCTG AGGGATCTGG AAAAACTCAC
255801 TTAATGCAGT CAGCTATCAG TGTTCTTCGT GAATCTGGAG GTAAGATTCT
255851 CTATGTTTCT TCGGATTTGT TTACAGAGCA CTTAGTCTCT GCTATCCGTT
255901 CAGGAGAAAT GCAAAAATTC CGTTCTTTTT ACCGCAATAT TGATGCTCTA
255951 TTCATTGAGG ATATCGAGGT TTTTTCAGGA AAGTCGGCAA CTCAAGAAGA
256001 GTTCTTCCAT ACGTTTAATT CTCTTCATTC TGAAGGGAAG TTGATTGTAG
256051 TGTCTTCATC CTATGCGCCT GTGGATCTCG TTGCTGTTGA AGATAGATTG
256101 ATCAGCAGGT TTGAATGGGG AGTTGCAATT CCGATACATC CTTTGGTTCA
256151 GGAAGGATTG CGCAGTTTCT TAATGAGACA GGTAGAGCGC TTATCTATTC
256201 GCATTCAAGA AACGGCCTTA GATTTTTTAA TTTATGCGCT ATCTTCCAAC
256251 GTAAAGACCT TACTGCATGC ACTGAATCTT TTAGCAAAGA GGGTAATGTA
256301 TAAAAAACTC TCTCACCAAT TACTATATGA AGATGATGTG AAAACTCTTT
256351 TAAAAGATGT TTTAGAAGCA GCAGGAAGCG TTCGTTTAAC TCCTTTAAAG
256401 ATTATTCGTA ATGTTGCTCA ATATTATGGG GTCTCTCAGG AGAGTATTTT



256451 AGGACGTTCT CAGTCCCGAG AATATGTATT GCCACGTCAG GTAGCCATGT
256501 ACTTTTGTCG TCAGAAGCTT TCACTATCAT ACGTGAGAAT AGGCGATGTC
256551 TTTTCAAGAG ATCATTCGAC GGTAATCTCA TCCATACGAT TGATTGAACA
256601 AAAAATAGAA GAAAATAGCC ATGATATTCA CATGGCTATT CAAGATATTT
256651 CTAAGAATTT AAATTCCTTG CATAAGAGTT TGGAATTTTT CCCTAGCGAA
256701 GAGATGATTA TTTAGAGGAG TAACGATCCT CTAATTTTTG CTTCTGGACA
256751 GTTGCAGCAG CCGGCGCAGA TATCATTTTC ATTGAGTTCA ACTCCGCTGT
256801 TCTAGCTTGT TTAAATTGTC TAATATGATT GAAAGCAAGG ACAAGAGCAA
256851 TACCAGCAAG TACTATCTGT ATAGCAAGAA CAATGCCGAT TCCTAAGGAT
256901 AGGGTGACTC CACTGGCAAT AACCAGTCCT AGGGTGGTTA AGGAAGCAAG
256951 GAGGAGACCT AGACTAACTA AAATAGGCTT AAAAAATACC CAGGAACAAT
257001 AACGATGTGT TGCCTTGTGA GATACTGATG GTTGTGGTTG TGTAGTCTGA
257051 GGTGTTTGTG CTACTGTAGC CATGTTGTTC TCCTGAGTAA AAAATTAAAT
257101 AATTTTAATC TGTCTATATA ATTGTTGTCA ATAATATTTT TATAGATATT
257151 ATTTTAATTT TATTAAAATT TATCAACTAA TTTTATTTAT AGTCTTTCTC
257201 TAGTTCAGAA CTATTAAGCT CTCTTAATAG GCTCTCTTTT ATTTACCAAG
257251 GCGAAGTTCT TT T"* T T T T TCAAATTTCG CATATGGAGG GTTCTAGATG
257301 GTATTTGAAA TGGTTGCATT GTGGAAGATT TTTCGAGTTT TGATAAGAAC
257351 AAAGTCAGTG TTGACTCTAT GAAACGGGCG ATTTTAGATC GTCTGTATTT
257401 AAGTGTTGTA CAATCACCAG AGTCCGCATC TCCTAGAGAT ATCTTCACAG
257451 CTGTTGCAAA AACTGTTATG GAATGGTTGG CCAAGGGGTG GCTGAAAACT
257501 CAAAATGGCT ACTATAAAAA TGATGTAAAA AGAGTTTATT ACCTTTCCAT
257551 GGAATTTCTC TTAGGGAGAA GTCTAAAAAG CAATCTTTTG AATTTAGGAA
257601 TTCTAGATTT AGTAAGGAAG GCACTAAAAA CTTTAAATTA TGACTTTGAC
257651 CACCTTGTAG AAATGGAATC CGATGCAGGA TTAGGAAATG GTGGTTTGGG
257701 GAGACTGGCA GCTTGTTACT TGGATTCTAT GGCTACATTA GCAGTTCCAG
257751 CCTACGGCTA CGGTATACGC TATGATTATG GTATTTTTGA TCAGAGGATC
257801 GTCAACGGGT ATCAAGAGGA AGCTCCTGAC GAGTGGCTAC GTTATGGAAA
257851 TCCTTGGGAA ATCTGTAGGG GAGAGTACCT CTATCCCGTA CGATTTTATG
257901 GAAGGGTCAT TCATTATACC GATTCTCGAG GGAAACAGGT GGCAGATCTT
257951 GTCGATACCC AAGAGGTATT GGCGATGGCT TATGATATTC CGATTCCTGG
258001 GTACGGTAAT GATACTGTAA ATTCTCTAAG GCTATGGCAA GCACAATCTC
258051 CGCGAGGCTT TGAATTCAGC TATTTTAACC ACGGGAACTA TATCCAGGCT
258101 ATAGAAGATA TCGCCTTGAT AGAAAACATC TCTCGCGTCC TCTATCCTAA
258151 TGATTCTATT ACTGAGGGGC AGGAATTGCG TCTCAAACAA GAGTATTTTT
258201 TAGTTTCAGC AACCATTCAA GATATTATCC GCAGATATAC AAAGACACAT
258251 ATTTGTTTGG ATAACCTTGC GGATAAAGTC GTAGTACAAT TAAACGATAC
258301 CCATCCCGCT TTAGGGATTG CTGAAATGAT GCATATTTTA GTCGATAGGG
258351 AAGAATTACC TTGGGATAAG GCTTGGGAGA TGACTACAGT CATCTTTAAC
258401 TATACCAATC ATACAATCCT CCCAGAGGCT TTAGAGAGAT GGCCTCTCGA
258451 TTTATTCTCT AAGTTATTAC CTCGGCATTT AGAGATTATT TATGAAATAA
258501 ATTCCCGTTG GTTAGAAAAA GTTGGCTCTC GCTATCCTAA AAATGATGAT
258551 AAGCGCCGGT CTTTATCCAT TGTTGAAGAA GGGTATCAAA AGCGTATCAA
258601 TATGGCAAAC CTTGCCGTAG TAGGTTCTGC AAAAGTAAAT GGAGTTTCGT
258651 CATTCCACTC TCAGCTGATT AAAGATACTC TCTTTAAAGA GTTTTATGAG
258701 TTTTTCCCTG AGAAGTTTAT CAATGTGACC AATGGGGTGA CTCCACGACG
258751 ATGGATTGCT CTCTGTAATC CTCGTTTGAG TAAGCTTCTC AATGAAACTA
258801 TAGGGGATCG TTATATCATT GATCTTTCTC ATCTTTCATT GATCCGTTCC
258851 TTTGCCGAAG ATAGTGGTTT CCGAGATCAT TGGAAAGGGG TAAAATTAAA
258901 AAATAAGCAG GATCTAACAA GTAGAATTTA TAATGAAGTT GGAGAAATAG
258951 TAGACCCTAA TTCTCTCTTT GACTGTCATA TTAAGCGTAT TCATGAGTAT
259001 AAACGACAAC TAATGAATAT TTCTTAGAGT CATCTATGTT TATAATGACT
259051 TGAAAGAAAA CCCTAATCAA GATGTCGTCC CTACAACAGT AATTTTTTCT
259101 GGTAAGGCGG CTCCTGGCTA TGTCATGGCC AAACTCATTA TCAAGTTAAT
259151 CAATAGCGTT GCTGACGTTG TAAATCAAGA TTCTCGAGTT AATGATAAGC
259201 TTAAGGTTCT TTTTTTACCT AACTATCGAG TTTCTATGGC TGAGCATATC
259251 ATTCCTGGTA CAGATCTTTC AGAACAGATT TCTACAGCTG GAATGGAGGC
259301 TTCTGGAACA GGAAATATGA AATTTGCTTT GAATGGAGCT CTGACTATAG
259351 GAACTATGGA CGGTGCAAAT ATAGAAATGG CAGAGCATAT TGGTAAGGAG
259401 AATATGTTTA TTTTTGGTCT TTTGGAGGAG CAAATTGTAC AACTGCGGAG
259451 GGAATACTGT CCTCAGACAA TTTGTGATAA GAATCCTAAG ATCCGTCAGG
259501 TTTTAGATTT GCTAGAACAG GGATTTTTCA ATAGCAATGA TAAAGATCTG
259551 TTTAAACCGA TAGTACATCG CCTACTGCAT GAAGGAGATC CCTTTTTTGT
259601 CTTGGCTGAC TTGGAGTCTT ATATCGCTGC CCATGAAAAT GTGAACAAAC
259651 TCTTTAAGGA ACCAGATTCA TGGACTAAGA TTTCTATTTA TAATACTGCA
259701 GGAATGGGCT TTTTCTCTAG TGACAGAGCC ATTCAGGATT ATGCCAGAGA
259751 TATTTGGCAT GTTCCTACAA AATCTTGCTC TGGAGAAGGA AATTAAGAAA
259801 TAGACAGGAT GGAATCAGGG ACTCTTTCCA TAGCCAAGGA GCTATAGAAA
259851 GAGTCCTTTT TGTTCAAAGA TTGCTAGTTT AATAGTAGGA CAGCCGGAGC
259901 TTCTAAGATC TTTTGTAATC GTTTCATAAA CATCGCAGCA GGATAACCAT
259951 CAATCACTCT ATGATCTACA GATAGGGTAA GATTGCAGGT AGATCCTATA
260001 GTAATTTCTC CGTCAAGAAC AAGAGCTTGT TCTGTAACAC TTCCTACGGC
260051 AAGAATCGCC GCTTGAGGAG GATTGACAAT CGCTGTAAAT TCAGTGATTC
260101 CTGTCATTCC TAAGTTAGAG ACACAGAAGG ACCCTCCTTT GTATTCAGTG
260151 TCTTGAAGAG ATTGATTTCT TGCTTTTAAC GCTAAGCTCT TAATTTCTGC
260201 TGAAATCATG CCGAGATTTT TACGGTCTGC GCAGCGTATA ATTGGCGTAA
260251 TAATTCCATC TGGAATGGCC ACAGCTATCG AGATATCGAT AGTATCAAAA
260301 CGGACGATTT TATTATCGAC ACTGTTAAAT CCTGAATTGA TAGAAGGGAA
260351 CTCTTTGAGC GCCAGAGCAC AGGCACGTAC AATGCAATCG TTAATAGAGA
260401 GTTTGATTCC CTGAGCTTGA AGTTCTTTGA GCAGATTAAG GAGAGGTGAG
260451 GCGTAGACCT GCTGCCTTAC ATAGAAGTGA GGAATAGAGA TCTTAGCAGC
260501 TTGTAGGCGT GCAGCAATCA CTTCCCGAAT CGGAGAGAGA TTCTCCTCAT
260551 GATAGGAACC TGGAGGCACT TCGGGAGACT CAGGATAGCC AAAACCAGCA
260601 ATGCTTTTAG GAGGAGCTTT CTCTAAATCT TTTTTTACTA TACGTCCTCC
260651 AGGACCACTC CCTTGAATTG ATGAGACATC TATGTTTTTC TCTTTTGCTA
260701 GTTGTCTAGC TAATGGAGAG AGATTATTCG TAGTGCCTAC GTGTTTGAAG
260751 ACTAAAGGCG AGGAGAGAGG TGGCTCTGGC TTAAAAGTTA CTGCTGTGAA
260801 TGTTGCTGAG GCAGCTTGTG GAGTTGTTGC AGGCGAGACC TCTTCAGAAG
260851 AACCTTTTGG AGATGCTTCA AGGTTAGAAG GTTCTGTCTT AGGAAGAAGT
260901 TCTTCTAGAT TAAAGGGCTC GTTGGCTTCT GTAGAGAGTA CCGCAATAGG
260951 GGTGCCTATA ACGATTTTCT CGCCTTCATG ACGTAAGATT TCACGAATCC
261001 AGCCATCTTC ATTTGCTGTA TGTTCTAAAA TAGCTTTGTC TGTAGAGATC
261051 TCTACAATGA CGTCTCCAAA ACTGACCTGA TCATTACTTT TTTTATGCCA
261101 TTTCACTATA GTGCCCACTT CCATAGTTGG AGAAAGCTTT GGCATTTTCA
261151 ATAAGGAGAT CACAAACTTA CCTCATGACT TTTTCAATGG TATCTAAGAT
261201 TCGGTTAACA TTAGGCAAAG TGGCCTGTTC TAAGATTTTA CTATAGGGCA
261251 TAGGCGTTTC TTTTTGGCAT ACCCTTAAGG GGGGAGCATC AAGAGAATCA
261301 AAAACATGCT CAGTAATCAG GGCAATAATT TCAGAAGAAA TCCCAGCGAA
261351 GTAGTGGCCC TCTTCAATTA CAATACAGCG TGAAGTTTTT CGTACCGATG
261401 ATAAAATTGT TGATATGTCT AAAGGTTTGA TCGTTCTTAG ATCAATAATT
261451 TCTATAGACA AGCCCCAACG TTTTTTGGCT AGAGAACACG CTTCTTTTGT
261501 AATGGAAACC ATACGGCTAT AAGTAATAAT TGTAAGGTCA TTTCCTTCTT
261551 GAACTCTATG TGCTTTCCCA ATAGGAACGA GATATTCTTC GGTGGGGACT
261601 TCCCCTTTTA AGTTATATTC TAGCTCGTTT TCTAAAAAAA GAACGGGGTT
261651 ATTATTTCTG ATTGCTGATT TTAATAAGCC TTTAGCGTCG TAAGGGTTCG
261701 AAGGGGCTAT AATAATAAGA CCTGGAATAT TAGCATACAA CGACTCAACG
261751 CAATGAGAAT GCTGGCAAGA TACCTGGGCT GCAGCACCAT TAGGGCCACG
261801 AAAAACTATA GGAACGGAAA ACTTCCCTCC AGTCATAAAA TGCATCTTAG
261851 CTGCATGAGA AATGATTTGG TCCAAGGCTA CAAAGGAAAA GTTCCAGCTC
261901 ATAAATTCTA TAATAGGGCG CAGGCCTGAC AATGCGGCTC CTATTCCAAT
261951 TCCAGAGAAG GCTGCTTCAC TAATAGGAGC ATCAATGACT CTCTTAGGGC
262001 CCCATTTATC TAATAAGCCT TTGGTGACTT TATAAGCACC ATTGTAGTCA
262051 CCAACCTCTT CACCAAGAAT ACAGACATTA GGATCGCGAG ACATCTCTTC
262101 GTCAATTGCT TCTCGGAGAG CTTCTCGAAT TTCTAATGTT TTATGTTTAG
262151 GCATAGACTC CTTCCTCTAA TGTGGTGACG GATGGATCTG ATGAGAGTTT
262201 TGCGTTAGAG AACGCTTCTA AAACAGCAGT TTTGCATTCT TGGCGTATAT
262251 TTTGAAATTC CTCTTCAGTC AGAACCTCTA ATCGAATTAG CCAATCTTTA
262301 GCTAGGACAA TAGGATCTTT TTTAAATAAA CACTGCATTT CTTCTTTCGA
262351 TCTATATAAA TTAGGATCTG ATATAGAATG CCCTCGAAAT CGGGAGCAGA
262401 GACACTCAAC TAAAACCGGA GATTCGGTAT CAACCATATA GCGATAAGCC
262451 TCTCTAAATC CTAAAAGAGA GTTAAATAGA TCAAAACCAT TGACTGTGAC
262501 TGCACGGATA TCGTAGGAAC TTCCTTGAGA CTCTGCTATG GGCTGTTTTG
262551 CAACAGCACG ATTTAATGAC GTTCCCATAC TCCAGCCGTT ATTTTCAATA
262601 ATAAGCATTA GAGGGAGTTG GTGAAGAGAA ACAAAGTTCA GAGTTTCATG
262651 GAATACACCT TGAGCTACCG CACCATCTCC GATAAAGCAT AGAGAAACTC
262701 TATTTTTTTG TTCTTGATAT TTGATGGTAA ATGCGGCTCC AGCTGCGAGG
262751 GGAATTTGTC CTCCGACAAT ACCAAATCCT CCAGGGAAAT TAGGCCCACA
262801 CATATGCATG GATCCTCCAC GACCTAAAGC GCATCCAGTT TCTTTCCCTA
262851 AAAGTTCAGC AGCAATTTCT TGAAGGGGAA TGTTGAGAAG AATCGCAAGT
262901 GCGTGGCAGC GGTATGAAGA GAACACCCAG GGATCTAGTC CTGTGTTTGC
262951 GATTGCAGCA GTTGCTACAG CTTCTTGGCC AGCGTAAGAG TGGTAAAATC
263001 CACCCACTAG CCCTTCTAGA TAGGCTTCTT CTCCTCGGGC TTCGAATTCA
263051 CGAATCAGAA CCATCTGTTT TAAAAATTTA ATACAGGAAG CGGGCCCGTA
263101 AAGGTCTAAG ATCCTTTCTA CTGTGGATTT CTCTGTGCCC TGAGAAGCTA
263151 TATTATAAGG TGCTGAACTA TCCATAACTT TTTTATAAGG GAGACGCTTC
263201 GGGAGCGGTT TTTGATCTCA TTATAGGAGC CTAGGTTTCA AAAGACAAGG
263251 GCTATCCGAA GAAAAAATAA TTAAGTTTTT TTAAAAAATC GAAAAAATTT
263301 TAATATGTTT CTCGTGGCAA TTTAAAGAGA ACAAAAAGGA TTACATCACA
263351 TGGACGTTTC TCGTAAAATT AATCGACACA CTCAGTTTTA CGTGGATTCG
263401 ATAGACGGTG TCATCAAAAA CTTTGATCAT AAGCCTAGTG AAGATAAATC
263451 TCGAGATCAT GAAGAATTAG AGGAAAAGCT TTTAACTATA ACAAAACGTA
263501 TTGTTGCTTC TGCTCAAGAA TTTCAGAATC GTAAGACGGA CTCTAAAAAC
263551 TACTACTTAA AAAAAACTCA ATGGTTGCCA TTCAAAAATG AAGAGTTGGA
263601 GCAAACTAAG GAATTGTTTG CCATGTTAAC TTCGATGGAT AAAAAAATAG
263651 CTCAGTTATT TTTTTATTCT CCCGGATGTA GTAGCGACTG GGTAGAATTT
263701 ACCGAAGTAA TTTGTCATTT AAATGATTCT ATAGGTTTGG GAGGGGTTTT
263751 ACTGTGTTGC GGATTATTCG AACAACAGTG TGAGCATGTT GTTACTGTGA
263801 ATAAAAAGCT AGACCTACCG CTTCTTTTAG GAACTACCGT TGTAAATAGT
263851 CTACGTTATT ATTTAACCTA TAGGAATATC TCTCTTTTGA ATTGTCAGAG
263901 CATGAGTGAG CTAGGCAAGG AGCTCGGGGA TGTTTTAAAG CAACATGGAG
263951 TTGCTTTCAC TTTAATTTTT AAAGAGATTG TGGATATAGA TTTATTAAAC
264001 TATGTGAAAC TAATTCAAGG ACTCAAGCGT AGTGGAAATA TTCAAGCACG
264051 TATTTATGAT AATGATGTGC CTACTTTACC TTCAGTTTCA TCGAGTCCTA
264101 TTGCGTTGAG ATATAGCTTA GCGAATACAA TTCGGGGCCT TGCTTTACAT
264151 GTAGATTTTT CTTCATTGAA GTTTATAAGC CCGTCTATAC TTTCCAATAC
264201 AGAGCACACT GCAAAAGCTT TAAACTCTGG CGGAGAATGT TTTATCTTTT
264251 CTAATCTGGA TGAATTCAAT CTTGGAATGA AAATAGTCAT GCAACTATTG
264301 CGGACAGGGA AGATTAGCCC AGAAATTTTA AATAAAAATA TCATGAAAAT
264351 TCTCATGATA AAAAGAAGAG TACGTTCTTT ATATATTTAA TTATTTGGTC
264401 TTTTTTTTGA TTTCAAAACT AATAAAATAA ATAGAATTTT AGATCTCTGA
264451 AGGAATCTCA GCAAGGCTGG GAGTCGATAG ATCTCTTACT TGTTTTTCTA
264501 ACTTACTTAG TCTTTCTTCA GTTTTAGGAA GGTTCCGAAT TTTAGCAATC



264551 AACCGATGTG TTTCTTGATA AGGTCGTGCT GGAGCGCCTC CATAAATGCC
264601 TGGAGAGGTG ATAGATTTTG TGACTCCAGT TTGAGCAATC ATGATCACAT
264651 GGTCTGCAAT AGAAATATGC CCAGTAATTC CGGTTTGCCC TCCAATGATG
264701 ACATGTTCAC CAATTTTTGT AGAACCTGCA ATGCCTGCTT GGGCAACAAT
264751 AATACTATGC TTTCCAATTT CTACGTGATG AGCTACTTGT ACTTGGTTAT
264801 CTATTTTAGT TCCTTCATGG ATCACGGTGT TCTTGAATCG ACCACGATCT
264851 ATCGTAGTGT TGGCTCCGAT TTCTACATCA TCACCTACAA TCACATAGCC
264901 TAGATGCTTT AAAGGTTTGT GATGACCAAA AGCATTTGTA ATATAACCAA
264951 AACCACAGGA TCCTAAAACA GCTCCAGGTT GAACAACTAC ACGGTTTCCC
265001 ATGAGGACTC TTTCTCGAAT CACCACCTTA GGGTGAATCA GACAGTTAGC
265051 ACCTAGAACG CTGTGAGCTC CAATGACACT TCCAGCTCCG ATGTATGTGT
265101 CAGAGCCGAT ATGGGCATGT TGACTAATGA CAACGTAAGG TTCTATGGTT
265151 ACATTTTTCT CAATACGTGC AGTAGGATGA ATCACTGCAG TAGGATGAAT
265201 ACCAGGAAAC CCTGATGTTA CGGGTTCAAT AAACAACTCT ATGCACTTTT
265251 GAAATGTTAG AGAAGGGGAT TCATTGGTAA TAAGAAAGTT TTTCTTTAGG
265301 TGGGCATGTT GCATTGCCTG AGATCTAGAT AAAATAATAG CACCAGCTTT
265351 GGTGTTTTTT AGAAAGCTAG AGTATTTCTC ATTATCTAAA AAAGCAATAT
265401 GGTGAGGTTG CGCCTGACTA ATATCTTCAA CACCTGAAAT AGGAGTTTCT
265451 ATATTTCCTT GAACTTCGAC TTGTAGTAGC TCAGCTAACT GTTTAAGAGT
265501 GTAGACTGGT GCTTCGGACA TAGAAAACTC CTTAAACTTG GACTAGTTTT
265551 GTTTTTTGAA AGATTCGTTA AGAATAGCAA TAATTTCGGT TGTTTTATCA
265601 GTCCCAGGTG CTATTGCTAA GACAGCTTCT TCATTAAGGA TAGCTTCTAG
265651 TTTTTCTTTG GACCGCACTG ATTCTGCAGC TATTTTTACT TCTTGAATGA
265701 GTTTTTGAAT GCGTTTTACA TGTACTTTGA TTGATAGATT GATAGTACTG
265751 AGACTGGTAC GCATTGTACT CTCCTGAAAG ATCTTCGAAT TTCTTTCGCA
265801 ACTCTTCAGA GGCAGAATCC GATAGGCTTT CCATGTAATC TTCATCTTGC
265851 AACTTATTAT AAATAGAAGT GAGTTCTTCT TCTATTTTCT CAGCATTTTT
265901 TACAAACTGC TGTTTCATAG CTTCCAATTC TTCAGTTTCC TTTTTACCTA
265951 GATCGGATTC TTCAAGACAT CGCTTTAAAT TAACATAGCC TAAATTTGCA
266001 TGAGCTGCGC TTGTTGATCC TAAAACAAGA AGAAATGTAG AAAATAATAA
266051 TTTTTTCATA ATTACCTGTA CTAACTTGGG TATAGAAAAA GGGTACCAAA
266101 AGCCTTTTCT GAAAACAACA AAGATTTCCT TCGATAAGTC CTTAATTTAT
266151 ATCTTAGAAC ATGCCCCCTA AAGCAAAGAA GAATCGCTGA GATACATCAA
266201 TTTTTTCTCC ATTCAAAGTC TCGGTTGGAC GGAAGGGCCA ACCAAATCCT
266251 AACATAACAG GAACATTATT CATTACATCG AAGCGCAGAC CAAATCCAGC
266301 ACTACTACGT AGATCTTTTA ACGAAATCTT ATACTCTTGT AAACCGACAA
266351 AACCTGAGTC TAAGAATACA AAGGCACTAA TATTAGGTTG TCTGATGAGA
266401 GGGTATTGAA ACTCTTCTGA AATAAGGAGC GAAGAGAGTC CTCCCTGAGG
266451 TTCTGTAGCA GAGTATTTTG GACCGATAAT AAAGGATTTA TATCCCCGAA
266501 CTGTAGTCTC TCCACCTAGG AAGAAGCGCT CACTGACAGG AACTCCTTCA
266551 GCTGTAGTAT TGCTATAGGG TTTAATAAAT TGAGCTTCCC CTTTGATTTT
266601 CAAAATACCT TTACGCGTAA GTTTTCTATA GATAGAGCTG TTTAAAGAGA
266651 GTTTTGTAAA ATGATAAGTT CCTCCCAAAC CAGAAACCTC AAAAGTCACC
266701 CCCCCGCGAA TCCCTGTAGT TGGAGTTCTA GGACTATCTA CAGAATCGTA
266751 ATTCAAGTTG ACACCTGCAG CAGAGACAAA TCCTTTATTG CTGTCTATAT
266801 TTGGCCCTAG GAGGAACTTA CGTTTTTCAT GTAAACTCGT TTGACTTCCT
266851 CGATAAAATA GACCGTATTT CAGGTGTTCG TTCAAGATAT ACGTTGTGCT
266901 GACGTTCCCG CCATAGGTTT GGACAGCATA ATCTTTAGAT AATGCTCTGT
266951 TAATTGATTT ATCTAATTCA ATTCCTAAAA TCCAAGGAGT GTTTAGAAAA
267001 TGAGGTTTGG TCCACTTCAA AGTATAGTQT GTGACTTTGT CCCCGAAGTT
267051 GGCTTTTAAG AATAGATGTT CTCCACCGCC TCTTAGACAA CGAAAACCTT
267101 TAGAAAATAT ATTTCTAGCT CCAAATAGAT CAAAATTACT TTCAGATAGT
267151 TCAATTCCTC CAAAAAGATT GTCAAGAGAA CTAAATCCTA AGAATAAGCC
267201 TAAGTTTCCT GTTGTTGTTT CTTTGACTTC TACAAAAATA TCTCGGTATT
267251 GATCCGCATT GCCCATAGGA TCAAGTTGAG AACGAACTGT ATAGACACTA
267301 ACGCTTTGGA AGTAGCCTGT ATTTCTTAAA CGTTGCTCAG TATCTTCTAG
267351 CTTTAAGCGA TTGAATGTAT CTCCTGGGAA GAGACTGGTT TCGTGTAAAA
267401 TAACGTCAGA TTTTGTATGG GTATTCCCAG TAATTTTAAT TAACCCAACT
267451 TTATAAGGAG ACCCTTCACT TACCTCATAA GTTACATCAT AAATAGGGCG
267501 GGTTGCGTGA GGGATGAAGA GAACGTCTAC ATTGGTATTG ATGTAGCCAT
267551 ACTTTGCATA AGTTTGTTTG ATCTTATGAG CCCCATCCCA TATTTTATCG
267601 GGGCAATAAA GATCATTGGG GCCGACTTGG GATTGCTTTT CTATAAGGCG
267651 TTTTGGCAAA ACCTCAAACC CTTGGATATG GACGTGTCCT AAGGTATATC
267701 GCGACCCTCG ATCAATATCC ATGTAAAGAA GAATATTCCC TTTGTCGTCA
267751 AGGTCATAGT GAGAGTTGAC TATAGCATCA GCGTACCCGT TATTATGTAG
267801 GTAATTCGTA ATTGCCAAGC TATCTTGTTC AACAATATCT GGGTGATAGA
267851 GTCCAGCTCC AGTAAACCAA CTTGTAGTTG TAGAGTGCTG CTTGGTTTGA
267901 ATAAATTCTT GGATATCTGA TTTTTCTGAT CGAGAGATTC CTGAGAACGT
267951 AAGCTGTTTA ATTTTCCCGC AAGGACCTTC ATTGATTTTA ATTAAAACAT
268001 CGATGTGACC TTTTTCTTGA TTGTGTTCCA GACTGTAGTC TACACTGGAT
268051 GCGAAATATC CTCGCTTGAG ATAATACGTT CTTAGATCAT CAAGACCCTT
268101 AAGAAATTTT TCTCGTTCAA AGAGATCATT ACGGTAAATT TGTAGGGTTT
268151 TAAGAATTTT ATGTTCAGGA ACGACTTGAT TTCCTGAGAT ATGAATATTT
268201 CGAATTGAGG GTTTAGCTAT TAGGTGAAGG GCTATGTTAG TTTTCCCTTC
268251 AGAAAATTCT ACTTTAGGCT CAACAGAGTC GTATTCTTTA GCTAGAATTC
268301 TCAAGTCTTC ATCAAAATCT AATTGAGAAA AAAGAGCCCC ACTTCTGGTC
268351 TTTAATTTGG GTAAGGGATG TTTATTTGAA GCATTTTCTC CTTCCGTTAT
268401 GATTGTGATA GAGTCTACCA CCACATGGCC TTCTTTAACT TTTTCAGTAG
268451 AAAATAAAGT TAAAGGGGTT TGGATTAACG CTAGAATAGA TATTTGCAAG
268501 ATAACTTTAT TTCGCATGAT GAGCATTCCC AAGAGTCTTC CCTAGATAGA
268551 AAGCTTGTTT ATCAGATAAT GAGGTAAGCA CAAAAATAGG GACAAAGAAG
268601 TTTTTTAACA AGACCTTACT ACTTAGATTA TTTTTGTATT TCTAAAGAAA
268651 TTTAAATTGT GTTTGATTTT TTCCCGTCCT AACACCGGGA TTATTACTGT
268701 GCTCAGACTA GAGCCTGAAT ATTTCGGCGG GTTTTTTGCT TCCTTAAAAA
268751 GCTTAATAAA CCTTGTCGCT CAAATAACTC TACTGATTAC ATGGTTTTAA
268801 GGCAAATTCG ATAATTCACT GTGTCCTAGG GGAAAGTAAG AAAGACTAGC
268851 ATAGAAGAGA ACCTTGTTTA CTTTTAGGAT AAGAGAGCTG CTAATAGGAG
268901 TGTCGTCCAG AAAAAGCTCT TGCCAGTGTC CCTGAATCTA CATAATCAAA
268951 AGATAAGCCT ATAGGAAGAC CTAAAGCTAG ACGGGAAATA TTTACAGAGA
269001 AATGTTGTAA TTCTTGTTTT AGAAAAAGGG CAGTAGCATC TCCTTCTAAG
269051 GTTGCATCAA TGGCTAGGAT AATTTCTTTT GGGCATAGCG TTTCTATGCG
269101 TGATTTTAAA ATGGAGAGAC GCTCGTTTTC TATATGTTTC CCTGTAATGG
269151 GCGATAAGAG TGAACCAAGA ACATGATAAC GTCCCTTGAA TACTTTAGAA
269201 CGTTCTAGAA AGAAAACATC TTTTGGAGAA GCGACAATAC ATAGACTTTG
269251 GTTATCTCTT TCTTCTCTAC AAAAGTGACA GTCTGCCTCT TTAGATTCTT
269301 TGAGAGTAAA ACATAGGGGA CAGTGACTAC GCTCACTAGC AACATTATGA
269351 AAAGCGTTAC CTAATATTTT TAATTGTTCG CTGTCCCAAG AGATGAGTTC
269401 AAAAGCAAGT TTTTCTGCTG TTTTAAATCC AATTCCTGGA AGTTTTCGTA
269451 AAAAGAAAAT TAATTTAGAT AAGTAATCTG GATATCTTGT CATAGTGATG
269501 TGAATTTTAT TTTTACATTC GGGTCTGGGG CCTAAATTAA GGTTAGAGTA
269551 TAAACTCTCT GAATAGTATA CTAGCTTTTT TCTTTACATG TGGTTCTCTG
269601 TGAATAAAAA CAAAAAAGCA GCAATTTGGG CAACGGGTTC CTATTTGCCT
269651 GAGAAAGTTC TTTCAAACGC AGATTTAGAA AAAATGGTAG ATACCTCTGA
269701 TGAGTGGATC GTGACCAGAA CGGGGATCAA AGAGCGTCGT ATTGCTGGAC
269751 CTCAGGAGTA CACTTCTCTT ATGGGAGCCA TCGCTGCAGA GAAAGCTATA
269801 GCAAATGCGG GTTTAAGCAA GGATCAGATT GACTGTATCA TTTTCTCGAC
269851 AGCAGCACCA GATTATATTT TCCCATCAAG CGGAGCTCTT GCTCAAGCAC
269901 ATTTAGGCAT TGAGGATGTC CCTACATTTG ATTGCCAGGC GGCTTGTACT
269951 GGGTATTTGT ATGGTTTGTC TGTAGCTAAG GCTTATGTAG AATCAGGTAC
270001 ATATAACCAT GTATTGTTAA TTGCTGCTGA TAAGTTGTCT TCTTTTGTAG
270051 ATTATACAGA TCGGAATACC TGTGTGTTGT TTGGAGATGG AGGAGCTGCT
270101 TGTGTCATAG GGGAGAGTCG GCCAGGATCT TTAGAGATTA ATAGGTTGTC
270151 TTTAGGCGCA GATGGTAAGC TAGGAGAGTT ATTAAGCCTT CCTGCTGGAG
270201 GTAGTCGTTG TCCTGCTTCT AAAGAGACTT TACAATCAGG CAAACATTTT
270251 ATTGCTATGG AGGGAAAAGA AGTTTTTAAG CATGCTGTGA GACGTATGGA
270301 AACGGCAGCT AAACATTCGA TAGCCCTGGC AGGCATTCAG GAAGAGGATA
270351 TAGATTGGTT TGTACCTCAT CAAGCTAATG AAAGAATAAT AGATGCTTTA
270401 GCGAAGCGTT TTGAGATTGA TGAGTCTAGA GTGTTTAAGA GTGTACATAA
270451 GTATGGAAAT ACTGCGGCCT CGTCTGTGGG CATTGCTTTG GATGAATTAG
270501 TTCATACAGA ATCCATTAAG CTTGATGATT ATTTACTTTT AGTTGCCTTT
270551 GGGGGCGGTT TGTCTTGGGG CGCAGTAGTT TTAAAGCAGG TCTAATAAGG
270601 ACGATAATTT CATGAAAAAA CGTTATGCTT TTTTGTTCCC AGGACAAGGG
270651 AGCCAATATG TAGGTATGGG ACAAGACCTA TATATGGAGT ATCCTGAGGT
270701 TAGAGAGCTT TTTGATTTTG CTAATGAAAG GTTAGGATTT TCTCTGACTT
270751 CAATTATGTT TGAAGGTCCT GAGGATCTTT TGATGGAAAC AGTACATAGT
270801 CAGCTAGCTA TTTATCTTCA TAGCATGGCT GTGGTAAAGG TTCTATCTCA
270851 GCGTTCTTCT ATTCAGCCTT CTTTAGTCTC TGGATTAAGT TTAGGGGAGT
270901 ATACTGCTTT AGTTGCTTCC GATAGAATCT CCGTGCTCGA CGGCCTTGAG
270951 CTTGTTAGAA AGCGTGGTCA GTTAATGAAT GAAGCTTGTA ATCAGAGCCC
271001 AGGGGCTATG GCGGCTTTAT TAGGGCTTCC CTCTGAAGTT ATAGAGGAAA
271051 ATATAACAAG TCTTGGTCAA GGAATTTGGA TTGCTAATTA TAATGCACCC
271101 AAACAGCTTG TAGTGGCTGG AATAGCAGAA AAAGTAGACC AAGCGATTGA
271151 GTTATTTCGT GATTTAGGAT GTAAAAAAGC AGTTCGTTTA AAGGTGTCTG
271201 GAGCATTTCA TACTCCTTTA ATGCAAGTTG CTCAAGATGG CTTAGCTCCA
271251 GACATTTATG CTTTATGCAT GAAAGATTCT AGCCTTCCCT TAGTGTCACA
271301 CGTGGTAGGA AAATCTTTAG TAAATACTGA AGAAATGCGA GAGTGTTTAG
271351 CTCGGCAAAT GACATCACCT ACGTTATGGT ATCAGAGTTG TTACCATATC
271401 GAATCAGAGG TGGATGAGTT TTTAGAATTA GGTCCAGGAA AAGTTTTGGC
271451 TGGTTTAAAT CGCTCTATAG GGATTTCTAA ACCGATTACA AGTCTTGGTA
271501 CTTTTGCTCA GATTGAAAAA TTCCTATCAG AGGTATGATT TGTATGGATA
271551 TAACATTAGT AGGCAAAAAA GTTATAGTAA CTGGAGGATC TCGAGGAATT
271601 GGACTCGGGA TAGTTAAGCT TTTTCTTGAG AACGGAGCAG ATGTAGAAAT
271651 TTGGGGATTG AATGAGGAGC GAGGTCAGGC TGTTATAGAA AGTTTAACAG
271701 GCTTGGGTGG CGAAGTTTCT TTTGCTCGTG TGGATGTGAG TCATAATGGT
271751 GGAGTGAAAG ATTGCGTGCA GAAATTTTTA GATAAGCACA ACAAAATAGA
271801 TATTTTGGTA AATAATGCAG GCATTACCAG GGATAATTTG TTGATGCGTA
271851 TGTCTGAGGA CGACTGGCAA TCGGTGATTA GCACCAACTT GACTTCCTTG
271901 TATTATACAT GTTCCTCAGT GATTCGCCAT ATGATTAAGG CGCGTTCAGG
271951 ATCTATTATA AATGTGGCTT CTATTGTTGC TAAGATCGGT AGTGCGGGCC
272001 AGACCAACTA TGCTGCTGCT AAAGCTGGGA TTATTGCTTT CACAAAATCT
272051 TTAGCTAAGG AAGTAGCTGC AAGAAATATT CGTGTCAACT GCCTTGCTCC
272101 AGGCTTTATT GAAACAGACA TGACAAGCGT GTTGAATGAC AATTTAAAAG
272151 CTGAGTGGCT TAAGTCGATC CCTTTAGGTA GGGCTGGCAC TCCAGAAGAT
272201 GTTGCTCGTG TGGCGTTGTT TTTAGCCTCG CAGTTATCGA GCTATATGAC
272251 CGCGCAGACA CTGGTTGTTG ATGGGGGATT GACTTACTAA GACAATAGAA
272301 GAAAGGGATT TGAAAATTTC TCTTCGAGAA CTAATTAAGT AACCGTCGAA
272351 TAAAAAATGA TTTTTTGCGA TACTAATTCT CTTTCTCTTT GTCCCTAGGG
272401 AATAGTGAAG TATTGTATAG TTTAAATAGT AAAAGGATAT AAGCAATGAG
272451 TTTAGAAGAT GATGTAATAG CAATTATTGT TGAGCAGTTA GGAGTGGATC
272501 CAAAAGAAGT TAATGAGAAC TCTTCTTTTA TTGAAGACTT GAATGCTGAT
272551 AGTTTAGATT TAACAGAATT GATTATGACT TTAGAAGAAA AATTTGCTTT
272601 TGAAATTTCA GAAGAAGATG CTGAGAAGCT TCGTACTGTC GGGGATGTAT



272651 TTACTTATAT TAAGAAACGT CAAGCTGAAC AATAAACTTT CTTATATTCT
272701 GGGTGCGTCG TAATCTTTTT TAAAGCTATT GTTTTTCAAT AGTGATTACG
272751 GCGTGCCTTT TTTTCTATAG AGTAGCTCAT CTAGAAAGAT TTATTTTGTC
272801 TTTTTAAGGT TCTCTGAACT TGATTTGTTT AGCATAGAGC TCTAAAAAAG
272851 ATAAAGCTAC GGATGGGCAC TCTTCCACAA TGTTTAGAAT TTGTCCTTTG
272901 CTAAGAACTA GCATGCGGAC TTGTGTATTT GCAGAAGCAT TGTATTCCCT
272951 GGGCTTATTA TTGAATAAGC TTTCCTCTCC AAAACAATCT AAAGGTTTTA
273001 AATTTAGAGG AGACTCTAGT TTTTCTTTAG AGATCGTAAT GTATCCTTCT
273051 ACAATGATAT AAAAGCTGAA TCCAGGTTGT CCTATAGAGA ATACATTGCT
273101 GCCAGGCTTA AATATTATCG TTTCAGTTTT ATCGGCAATT GTTAAAAGAA
273151 GGTCCATGTC TAAAGATTGG AATATAATCG TTTTTTTTAG TAGAAAGGCG
273201 CGATCGATCA AATTCATAAA AAAGTTCCTT ATTCACACCA TAGTTTTAGT
273251 TTTT